The Good Tech Companies - Behind AI Agents: The Infrastructure That Supports Autonomy
Episode Date: January 29, 2025This story was originally published on HackerNoon at: https://hackernoon.com/behind-ai-agents-the-infrastructure-that-supports-autonomy. Learn about the infrastructure t...hat supports orchestration across many moving parts and a long history of data and context needed to build agentic systems. Check more stories related to programming at: https://hackernoon.com/c/programming. You can also check exclusive content about #software-development, #generative-ai, #autonomous-agents, #ai-agents, #autonomy, #component-orchestration, #context-management, #good-company, and more. This story was written by: @datastax. Learn more about this writer by checking @datastax's about page, and for more stories, please visit hackernoon.com. Learn about the infrastructure that supports orchestration across many moving parts and a long history of data and context needed to build agentic systems.
This audio is presented by Hacker Noon, where anyone can learn anything about any technology.
Behind AI Agents: The Infrastructure That Supports Autonomy, by DataStax.
Most descriptions of AI agents and agentic systems focus on agents' ability to act autonomously,
without user intervention, in many situations across the agent's intended use cases.
Some agents operate with a human-in-the-loop model, engaging the user only when they encounter
uncertainty, but still acting autonomously under typical circumstances and in situations where the agent is certain.
With autonomy being the primary defining feature of AI agents, there are supporting
capabilities that agents need in order to act independently from user input.
In an earlier blog post, we identified four requirements for agentic AI architectures:

1. Ability and access: the capability to act on behalf of the user, including permissions and authenticated access to relevant systems.
2. Reasoning and planning: using reasoning to make decisions within a structured thought process, often defined as a chain, tree, graph, or algorithm that guides the agent's actions.
3. Component orchestration: coordination of multiple parts, including prompts, LLMs, available data sources, context, memory, history, and the execution and status of potential actions.
4. Guardrails: mechanisms to keep the agent focused and effective, including safeguards to avoid errors or provide helpful diagnostic information in case of failure.

Each of these
four requirements has different infrastructure needs. For ability and access, the primary needs
are software integrations and credential management. Reasoning and planning are mainly
supported by LLMs and other AI models. The topic of guardrails is vast and often specific to the use cases involved,
so we will save that for a future article. Here, I'd like to focus on orchestration,
and the infrastructure needed to support intelligent orchestration across a large
number of moving parts and a long history of data and context that might be needed at decision time.
Component orchestration and the role of context in AI agents

Assuming that the first two requirements above, including ability, access,
reasoning, and planning, are functioning as intended, the main challenge of component
orchestration boils down to knowledge management. The agentic system needs to maintain awareness on
a variety of levels, its core tasks and goals, the state of various relevant systems,
the history of interactions with the user and other external systems, and potentially more.
With LLMs, we use the concept of a context window to describe the set of information available to
the model, generally at prompt time. This is distinct from the information contained in the
prompt itself and also distinct from the LLM's internal knowledge set that was formed during the model training process. Loosely speaking, a context window
can be thought of as a recent history of information that is available to the LLM at prompt time.
This is implicit in the architecture of LLMs and prompting. In that way, most LLMs have a
one-dimensional concept of context, and older context simply falls out
of the window over time.
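This one-dimensional, oldest-falls-out behavior can be illustrated with a minimal sketch. The class and tokenizer below are illustrative stand-ins, not any real LLM API:

```python
from collections import deque

class SlidingContextWindow:
    """Illustrates the one-dimensional context model of a plain LLM:
    when the token budget is exceeded, the oldest content simply
    falls out of the window."""

    def __init__(self, max_tokens: int):
        self.max_tokens = max_tokens
        self.messages: deque = deque()  # (text, token_count) pairs
        self.total_tokens = 0

    def add(self, text: str) -> None:
        tokens = len(text.split())  # crude stand-in for a real tokenizer
        self.messages.append((text, tokens))
        self.total_tokens += tokens
        # Evict the oldest messages until we fit the budget again.
        while self.total_tokens > self.max_tokens and len(self.messages) > 1:
            _, dropped = self.messages.popleft()
            self.total_tokens -= dropped

    def render(self) -> str:
        return "\n".join(text for text, _ in self.messages)

window = SlidingContextWindow(max_tokens=8)
window.add("user prefers morning meetings")  # 4 tokens, fits
window.add("user is based in Berlin")        # 5 tokens, evicts the first message
print(window.render())
```

Note that there is no notion of importance here: the first fact is lost purely because it is older, which is exactly the limitation the next paragraphs address.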
Agents need a more sophisticated system for managing context and knowledge, in order to
make sure that the most important or urgent context is made a priority, whenever the agent
needs to make a decision.
Instead of a single monolithic context, AI agents must track different types of context
at varying levels of importance.
This can be compared to memory in computer systems, where different types of storage
(cache, RAM, and hard drives) serve different purposes based on accessibility and frequency
of use. For AI agents, we can conceptually structure context into three primary levels:

1. Primary context: the agent's core task list or goals. This should always be top of mind, guiding all actions.
2. Direct context: the state of connected, relevant systems in the immediate environment, including resources like messaging systems, data feeds, critical APIs, or a user's email and calendars.
3. External context: general knowledge, or any information that might be relevant but which is not explicitly designed to be a core part of the agentic system. External context could be provided by something as simple as a search of the internet or Wikipedia. Or, it could be urgent and complicated, such as unexpected factors that arise from third-party news or updates, requiring the agent to adapt its actions dynamically.

These levels of context are not definitive, the lines between them can be very blurry, and there are other useful ways of describing types of context, but this conceptual structure is useful for our discussion here.

Storage infrastructure for context management
The storage needs of AI agents vary depending on the type of context being managed.
Each level, primary, direct, and external context, requires different data structures,
retrieval mechanisms, and update frequencies. The key challenge is ensuring efficient access,
long-term persistence, and dynamic updates without overloading the agent's processing pipeline.
Rather than treating context as a monolithic entity,
AI agents benefit from hybrid storage architectures that blend structured and unstructured data models.
This allows for fast lookups, semantic retrieval, and scalable persistence,
ensuring that relevant context is available when needed while minimizing redundant data processing.
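The three context levels can be sketched as a simple prioritized structure that always spends a limited context budget on primary context first, then direct, then external. The class and method names are hypothetical, illustrating the concept rather than any real framework:

```python
from dataclasses import dataclass, field

@dataclass
class AgentContext:
    """Three-tier context store: items are plain strings here, but in
    practice each tier would be backed by different storage."""
    primary: list = field(default_factory=list)   # core tasks and goals
    direct: list = field(default_factory=list)    # state of connected systems
    external: list = field(default_factory=list)  # general, on-demand knowledge

    def assemble(self, budget: int) -> list:
        """Build a prompt context, filling the budget tier by tier so
        the most important context is never crowded out."""
        out = []
        for tier in (self.primary, self.direct, self.external):
            for item in tier:
                if len(out) >= budget:
                    return out
                out.append(item)
        return out

ctx = AgentContext(
    primary=["goal: schedule meeting with Alex"],
    direct=["calendar: Tuesday 10:00 free", "email: awaiting reply from Alex"],
    external=["wiki: timezone conventions"],
)
print(ctx.assemble(budget=3))
```

Unlike the sliding window earlier, eviction here is by importance tier rather than by age: external knowledge is the first to be dropped when the budget is tight.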
Primary context: task lists and agent goals

The primary context consists of the agent's core objectives and active tasks, the foundation that drives decision-making. This information must be persistent, highly structured, and easily queryable, as it guides all agent actions.

Potential storage needs:

- Transactional databases (key-value or document stores) for structured task lists and goal hierarchies.
- Low-latency indexing to support quick lookups of active tasks.
- Event-driven updates to ensure tasks reflect real-time progress.

Example agent implementation: a scheduling assistant managing a task queue needs to store:

- Persistent tasks (e.g., "Schedule a meeting with Alex") with status updates.
- Execution history (e.g., "Sent initial email, awaiting response").
- Priorities and dependencies, ensuring urgent tasks are surfaced first.

A distributed, highly available data store ensures that tasks are tracked reliably, even as the agent processes new events and context updates.
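The scheduling-assistant example can be sketched as a small in-memory task store, a hypothetical stand-in for the durable, distributed database a production agent would use. It tracks tasks, statuses, execution history, and priorities:

```python
import heapq
import itertools

class TaskStore:
    """In-memory sketch of a primary-context task store: a priority
    queue surfaces the most urgent open task, while a dict holds
    status and execution history per task."""

    def __init__(self):
        self._heap = []                 # (priority, insertion_seq, task_id)
        self._seq = itertools.count()   # tie-breaker for equal priorities
        self.tasks = {}                 # task_id -> {"desc", "status", "history"}

    def add(self, task_id, desc, priority):
        self.tasks[task_id] = {"desc": desc, "status": "pending", "history": []}
        heapq.heappush(self._heap, (priority, next(self._seq), task_id))

    def update(self, task_id, status, note):
        self.tasks[task_id]["status"] = status
        self.tasks[task_id]["history"].append(note)  # execution history

    def next_task(self):
        """Surface the most urgent task that is not yet done."""
        while self._heap:
            _, _, task_id = self._heap[0]
            if self.tasks[task_id]["status"] == "done":
                heapq.heappop(self._heap)  # lazily discard finished tasks
                continue
            return task_id
        return None

store = TaskStore()
store.add("t1", "Schedule a meeting with Alex", priority=1)
store.add("t2", "File weekly report", priority=2)
store.update("t1", "waiting", "Sent initial email, awaiting response")
print(store.next_task())  # t1 is still the most urgent open task
```

Swapping the dict and heap for a transactional database with an index on priority gives the persistent, queryable version described above.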
Direct context: state of connected systems

Direct context includes the current state of relevant systems: calendars, messaging platforms, APIs, databases, and other real-time data sources. Unlike primary context, direct context is dynamic and often requires a combination of structured and real-time storage solutions.

Potential storage needs:

- Time-series databases for event logs and real-time status tracking.
- Caching layers for frequently accessed system states.
- Vector-based retrieval for contextual queries on recent interactions.

Example agent implementation: a customer support AI agent tracking live user interactions needs to store:

- Real-time conversation history in an in-memory store.
- Session state (e.g., ongoing support ticket details) in a time-series database.
- API response caches for external system lookups, avoiding redundant queries.
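The API-response cache mentioned above can be sketched as a time-to-live (TTL) cache: entries expire so the agent re-queries external systems rather than acting on stale state. This is an illustrative in-process version of what a caching layer like Redis provides:

```python
import time

class TTLCache:
    """Minimal TTL cache for direct context: a cached API response is
    served while fresh, and discarded once it exceeds its time-to-live,
    forcing a fresh external lookup."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._data = {}  # key -> (value, stored_at)

    def put(self, key, value):
        self._data[key] = (value, time.monotonic())

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if time.monotonic() - stored_at > self.ttl:
            del self._data[key]  # stale: caller should re-query the system
            return None
        return value

cache = TTLCache(ttl_seconds=60.0)
cache.put("ticket:42", {"status": "open", "assignee": "support-bot"})
print(cache.get("ticket:42"))
```

The TTL encodes the judgment call in this section: how long a piece of direct context can be trusted before the agent must look at the live system again.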
By structuring direct context storage with a combination of time-sensitive and long-term data stores, AI agents can act with awareness of their environment without excessive latency.

External context: knowledge retrieval and adaptation

External context encompasses general knowledge and unexpected updates from sources outside the agent's immediate control. This could range from on-demand search queries to dynamically ingested external data, requiring a flexible approach to storage and retrieval. Unlike primary and direct contexts, which are closely tied to the agent's ongoing tasks and connected systems, external context is often unstructured, vast, and highly variable in relevance.

Potential storage considerations:

- Document stores and knowledge bases for persistent, structured reference material.
- Vector search for querying large datasets of documents, internal or external.
- Retrieval-augmented generation (RAG) to fetch relevant knowledge before responding.
- Streaming and event-driven ingestion for real-time updates from external data sources.

Example agent implementation: a personal assistant assembling a report on the latest scientific discoveries in climate change research needs to:

- Retrieve scientific articles from external sources, filtering for relevance based on keywords or vector similarity.
- Analyze relationships between papers, identifying trends using a knowledge graph.
- Summarize key insights using LLM-based retrieval-augmented generation.
- Track recent updates by subscribing to real-time publication feeds and news sources.

By structuring external context storage around fast retrieval and
semantic organization, AI agents can continuously adapt to new information while ensuring that
retrieved data remains relevant, credible, and actionable.

Hybrid storage for context-aware AI agents

Designing context-aware AI agents requires
a careful balance between efficient access to critical information and avoiding memory
or processing overload. AI agents must decide when to store, retrieve, and process context
dynamically to optimize decision-making. A hybrid storage architecture, integrating
transactional, vector, time-series, and event-driven models,
allows AI agents to maintain context persistence, retrieval efficiency, and adaptive intelligence,
all of which are crucial for autonomy at scale. Achieving this balance requires structured strategies across three key dimensions:

1. Latency versus persistence: frequently accessed context (e.g., active task states) should reside in low-latency storage, while less frequently needed but essential knowledge (e.g., historical interactions) should be retrieved on demand from long-term storage.
2. Structured versus unstructured data: tasks, goals, and system states benefit from structured storage (e.g., key-value or document databases), while broader knowledge retrieval requires unstructured embeddings and graph relationships to capture context effectively.
3. Real-time versus historical awareness: some contexts require continuous monitoring (e.g., live API responses), whereas others (e.g., prior decisions or reports) should only be retrieved when relevant to the agent's current task.

Given these different
types of contexts, AI agents need a structured approach to storing and accessing information.
Relying solely on LLM context windows is inefficient, as it limits the agent's ability
to track long-term interactions and evolving situations. Instead, context should be persistently stored, dynamically retrieved,
and prioritized based on relevance and urgency.

- Primary context (tasks and goals): stored in transactional databases for structured tracking and referenced in every inference cycle.
- Direct context (system state and active data): maintained in real time through caching, time-series storage, or event-driven updates.
- External context (knowledge and dynamic updates): queried on demand via vector search, retrieval-augmented generation (RAG), or graph-based knowledge representation.

In practice, multi-tiered memory models combining
short-term caches, persistent databases, and external retrieval mechanisms are required for scalable AI agent architectures.
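Such a multi-tiered memory model can be sketched as a lookup chain that consults a short-term cache first, then a persistent store, and only then an external retriever. All three layers here are illustrative in-process stand-ins for the real systems named in this section:

```python
class TieredMemory:
    """Cache -> persistent store -> external retrieval, with results
    promoted into faster tiers on a hit so repeated lookups get cheaper."""

    def __init__(self, external_lookup):
        self.cache = {}   # short-term, fastest (stand-in for Redis)
        self.store = {}   # persistent, slower (stand-in for a database)
        self.external_lookup = external_lookup  # e.g. vector search / RAG

    def put(self, key, value):
        self.store[key] = value

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        if key in self.store:
            self.cache[key] = self.store[key]  # promote to the cache
            return self.store[key]
        value = self.external_lookup(key)      # on-demand retrieval
        if value is not None:
            self.store[key] = value            # persist what we learned
            self.cache[key] = value
        return value

# A trivial stand-in for an external retriever.
memory = TieredMemory(external_lookup=lambda k: f"searched:{k}")
memory.put("goal", "schedule meeting")
print(memory.get("goal"))        # served from the persistent store
print(memory.get("news:today"))  # falls through to external retrieval
```

The promotion step is the key design choice: knowledge fetched once from a slow tier becomes cheap to access on subsequent decisions.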
By leveraging a hybrid storage approach, AI agents can maintain real-time awareness of active systems,
retrieve historical knowledge only when relevant, and dynamically adjust priorities based on evolving needs. By integrating these storage strategies,
AI agents can function autonomously, retain contextual awareness over long periods,
and respond dynamically to new information, laying the foundation for truly intelligent
and scalable agentic systems.

Hybrid storage solutions

Implementing a hybrid storage architecture for AI agents requires selecting the right databases and storage tools to handle different types of contexts efficiently.
The best choice depends on factors such as latency requirements, scalability, data structure
compatibility, and retrieval mechanisms.
A well-designed AI agent storage system typically includes:

- Transactional databases for structured, persistent task tracking.
- Time-series and event-driven storage for real-time system state monitoring.
- Vector search and knowledge retrieval for flexible, unstructured data access.
- Caching and in-memory databases for rapid short-term memory access.

Let's take a closer look at each of these elements.
Transactional and distributed databases

AI agents require scalable, highly available transactional databases to store tasks, goals, and structured metadata reliably. These databases ensure that primary context is always available and efficiently queryable.

- Apache Cassandra: a distributed NoSQL database designed for high availability and fault tolerance, ideal for managing structured task lists and agent goal tracking at scale.
- DataStax Astra DB: a managed database-as-a-service (DBaaS) built on Cassandra, providing elastic scalability and multi-region replication for AI applications requiring high durability.
- PostgreSQL: a popular relational database with strong consistency guarantees, well-suited for structured agent metadata, persistent task logs, and policy enforcement.
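The structured task tracking these databases provide can be sketched with SQLite as a lightweight, in-process stand-in for PostgreSQL or Cassandra. The schema is illustrative, not taken from any real agent framework:

```python
import sqlite3

# In-memory database as a stand-in for a durable transactional store.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE tasks (
        id          INTEGER PRIMARY KEY,
        description TEXT NOT NULL,
        status      TEXT NOT NULL DEFAULT 'pending',
        priority    INTEGER NOT NULL
    )
""")
conn.execute(
    "INSERT INTO tasks (description, status, priority) VALUES (?, ?, ?)",
    ("Schedule a meeting with Alex", "awaiting response", 1),
)
conn.execute(
    "INSERT INTO tasks (description, status, priority) VALUES (?, ?, ?)",
    ("File weekly report", "pending", 2),
)
conn.commit()

# Low-latency lookup of active tasks, most urgent first: the query an
# agent would run at the start of every inference cycle.
rows = conn.execute(
    "SELECT description, status FROM tasks WHERE status != 'done' ORDER BY priority"
).fetchall()
print(rows)
```

In a distributed deployment the same schema and queries would run against a replicated store, so task state survives agent restarts and concurrent updates.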
Time-series and event-driven storage for real-time system monitoring

AI agents need databases optimized for logging, event tracking, and state persistence.

- InfluxDB: a leading time-series database designed for high-speed ingestion and efficient queries, making it ideal for logging AI agent activity and external system updates.
- TimescaleDB: a PostgreSQL extension optimized for time-series workloads, suitable for tracking changes in AI agent workflows and system events.
- Apache Kafka plus ksqlDB: a streaming data platform that allows AI agents to consume, process, and react to real-time events efficiently.
- Redis Streams: a lightweight solution for real-time event handling and message queuing, useful for keeping AI agents aware of new updates as they happen.
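The event-driven pattern these tools support can be sketched with a tiny in-process publish/subscribe bus, a hypothetical stand-in for Kafka topics or Redis Streams: handlers subscribe to topics, and the agent reacts as events arrive instead of polling:

```python
from collections import defaultdict

class EventBus:
    """Minimal in-process pub/sub: a stand-in for a streaming platform,
    illustrating how an agent stays aware of system updates."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, event):
        # Deliver the event to every handler subscribed to this topic.
        for handler in self._handlers[topic]:
            handler(event)

seen = []
bus = EventBus()
bus.subscribe("calendar", lambda e: seen.append(f"calendar update: {e}"))
bus.publish("calendar", "meeting moved to 11:00")
print(seen)
```

A real streaming platform adds what this sketch omits: durable logs, replay, consumer groups, and delivery guarantees across processes.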
Vector search for knowledge retrieval

AI agents working with unstructured knowledge require efficient ways to store, search, and retrieve embeddings for tasks like semantic search, similarity matching, and retrieval-augmented generation (RAG). A well-optimized vector search system enables agents to recall relevant past interactions, documents, or facts without overloading memory or context windows.

- DataStax Astra DB: a scalable, managed vector database built on Cassandra, offering high-performance similarity search and multimodal retrieval. Astra DB combines distributed resilience with vector search capabilities, making it a top choice for AI agents that need to process embeddings efficiently while ensuring global scalability and high availability.
- Weaviate: a cloud-native vector database designed for semantic search and multimodal data retrieval. It supports hybrid search methods and integrates well with knowledge graphs, making it useful for AI agents that rely on contextual reasoning.
- FAISS (Facebook AI Similarity Search): an open-source library for high-performance nearest-neighbor search, often embedded in AI pipelines for fast vector lookups on large datasets. While not a full database, FAISS provides a lightweight, high-speed solution for local similarity search.

Caching and in-memory storage

AI agents require low-latency access to frequently referenced context, making caching an essential component of hybrid storage architectures.

- Redis: a high-performance in-memory key-value store, widely used for short-term context caching and session management in AI agents.
- Memcached: a simple but effective distributed caching system that provides rapid access to frequently used AI agent data.

By integrating these diverse storage solutions,
AI agents can efficiently manage short-term memory, persistent knowledge, and real-time updates,
ensuring seamless decision-making at scale. The combination of transactional databases,
time-series storage, vector search, and caching allows agents to balance speed,
scalability, and contextual awareness, adapting dynamically
to new inputs. As AI-driven applications continue to evolve, selecting the right hybrid storage
architecture will be crucial for enabling autonomous, responsive, and intelligent agentic
systems that can operate reliably in complex and ever-changing environments.
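The vector search component discussed above can be illustrated with a brute-force nearest-neighbor sketch over toy embeddings. The vectors are tiny hand-made stand-ins for real embeddings; systems like FAISS or a vector database do the same comparison with approximate indexes at far larger scale:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def nearest(query, corpus):
    """Return the corpus key whose vector is most similar to the query
    (exhaustive scan: fine for a sketch, not for millions of vectors)."""
    return max(corpus, key=lambda k: cosine(query, corpus[k]))

# Toy "embeddings" for documents an agent might recall.
corpus = {
    "meeting notes":  [0.9, 0.1, 0.0],
    "climate report": [0.1, 0.9, 0.2],
    "support ticket": [0.0, 0.2, 0.9],
}
print(nearest([0.2, 0.8, 0.1], corpus))  # most similar: "climate report"
```

Replacing the exhaustive scan with an approximate index (and the toy vectors with model-generated embeddings) is exactly what the vector databases above provide.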
The future of AI agents with hybrid databases

As AI systems grow more complex, hybrid databases
will be crucial for managing short-term and long-term memory, structured and unstructured
data, and real-time and historical insights. Advances in retrieval augmented generation,
RAG, semantic indexing, and distributed inference are making AI agents more efficient,
intelligent, and adaptive. Future AI agents will rely on fast,
scalable, and context-aware storage to maintain continuity and make informed decisions over time.
Why hybrid databases?

AI agents need storage solutions that efficiently manage different types of context while ensuring speed, scalability, and resilience. Hybrid databases offer the best of both worlds, high-speed structured data with deep contextual retrieval, making them foundational for intelligent AI systems.
They support vector-based search for long-term knowledge storage, low-latency transactional
lookups, real-time event-driven updates, and distributed scalability for fault tolerance.
Building a scalable AI data infrastructure

To support intelligent AI agents, developers should design storage architectures that combine multiple data models for seamless context management.

- Vector search and columnar data: store semantic context alongside structured metadata for fast retrieval.
- Event-driven workflows: stream real-time updates to keep AI agents aware of changing data.
- Global scale and resilience: deploy across distributed networks for high availability and fault tolerance.

By integrating transactional processing, vector search, and real-time updates,
hybrid databases like DataStax Astra DB provide the optimal foundation for AI agent memory, context awareness, and decision-making. As AI-driven applications evolve, hybrid storage solutions will be essential for enabling autonomous, context-rich AI agents that operate reliably in dynamic, data-intensive environments.

Written by Brian Godsey, DataStax.

Thank you for listening to this HackerNoon story, read by Artificial Intelligence.
Visit HackerNoon.com to read, write, learn and publish.