005: Agent/Data
- Requires:
- Enables:
- Complemented by:
A persistent context message containing a data value and an optional schema. It is retained across an agent's execution loop to provide a stable, structured context.
The Data Protocol is a low-level pattern for providing structured, self-describing information. It serves as a foundational mechanism used by other subsystems, such as Input and State, to manage structured data within an agent's context. Unlike ephemeral messages, Data messages are persistent and are retained across multiple steps of an agent's process, providing a stable context for multi-step tasks.
The Data Message
A Data message is a simple construct for adding structured context to a Request. It is a message object that contains the following properties:
- data: Any JSON value (e.g., string, number, object, array) that contains the payload.
- schema: An optional JSON Schema that defines the structure and semantic meaning of the data.
- kind: An optional string that identifies the message's role (e.g., "state", "input"). This allows the system and the LLM to differentiate between various types of data within the same context.
By pairing data with an optional schema, a Data message makes the context machine-readable. The schema acts as a blueprint, explaining to the LLM what each property means, what its type is, and what constraints apply. This not only guides the LLM's reasoning but also serves as a form of documentation, showing users what is possible to change or configure.
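As a rough illustration, the properties listed above could be typed as follows. This is a minimal sketch, not the protocol's actual definition: the names JsonValue, DataMessage, and isDataMessage are hypothetical, and the type and description fields are taken from the example later in this chapter rather than from the property list.

```typescript
// A JSON value, as allowed in the `data` property.
type JsonValue =
  | string
  | number
  | boolean
  | null
  | JsonValue[]
  | { [key: string]: JsonValue };

// Hypothetical shape of a Data message (an assumption, not the official type).
interface DataMessage {
  type: 'data';
  data: JsonValue;
  schema?: Record<string, unknown>; // a JSON Schema document
  kind?: string; // e.g. 'state' or 'input'
  description?: string;
}

// Runtime guard. It checks only the message envelope; it does not
// validate `data` against `schema`.
function isDataMessage(value: unknown): value is DataMessage {
  if (typeof value !== 'object' || value === null) return false;
  const msg = value as Record<string, unknown>;
  return (
    msg.type === 'data' &&
    'data' in msg &&
    (msg.kind === undefined || typeof msg.kind === 'string') &&
    (msg.schema === undefined ||
      (typeof msg.schema === 'object' && msg.schema !== null))
  );
}
```

Validating data against schema would need a JSON Schema validator, which is deliberately left out of this sketch.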
Merging and Identity
The protocol is designed to handle multiple Data messages within a single context. If the system decides that multiple messages share the same identity, they are merged into a single, coherent object. This is particularly useful for scenarios like applying a series of state patches.
A message's identity is primarily determined by its kind. For example, all messages of kind: "state" without any other distinguishing features are considered to have the same identity. This identity can be further specified by other protocols, most notably Instancing, which allows for parallel processing by creating distinct contexts.
When multiple mergeable messages are present (e.g., several state objects representing patches), the system can handle this in two ways. First, the agent's execution logic can explicitly merge these objects into a single, coherent state object before presenting it to the LLM. This reduces the cognitive load on the model. Second, the LLM itself can "mentally merge" the information in its latent space, understanding that the separate messages represent different facets of a single concept.
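The first option, explicit merging in the agent's execution logic, might look like the sketch below. Several details are assumptions rather than part of the protocol: the function names, the shallow "later message wins" merge policy, the choice to keep the most recent schema and description, and the placement of _instance (described under Instancing later in this chapter) directly on the message.

```typescript
type JsonObject = { [key: string]: unknown };

interface DataMessage {
  type: 'data';
  data: unknown;
  kind?: string;
  schema?: JsonObject;
  description?: string;
  _instance?: string; // assumed to live on the message itself
}

// Identity key: `kind`, further scoped by `_instance` when present.
// Note: in this sketch, messages without a kind all share one identity.
function identityOf(msg: DataMessage): string {
  return JSON.stringify([msg.kind ?? null, msg._instance ?? null]);
}

function isPlainObject(value: unknown): value is JsonObject {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Merge messages that share an identity. Assumed policy: object payloads
// are shallow-merged with later messages winning; any other payload is
// replaced by the later one.
function mergeDataMessages(messages: DataMessage[]): DataMessage[] {
  const merged = new Map<string, DataMessage>();
  for (const msg of messages) {
    const key = identityOf(msg);
    const prev = merged.get(key);
    if (!prev) {
      merged.set(key, { ...msg });
      continue;
    }
    const data =
      isPlainObject(prev.data) && isPlainObject(msg.data)
        ? { ...prev.data, ...msg.data }
        : msg.data;
    merged.set(key, {
      ...prev,
      data,
      schema: msg.schema ?? prev.schema,
      description: msg.description ?? prev.description,
    });
  }
  return Array.from(merged.values());
}
```

A shallow merge keeps the sketch short; a real patch mechanism would probably need a deep merge or an explicit patch format.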
Example of how Data messages are seen by the LLM
- The text message is passed through as-is.
- The data messages are merged and transformed into a single text message.
What the code looks like
Agent.Request(config, schema, [
  {
    type: 'text',
    text: "Update the user's city to Austin",
  },
  // Schema is serialized for LLM to understand semantics
  {
    type: 'data',
    kind: 'user',
    description: 'Represents the current user.',
    data: { name: 'John Doe' },
    schema: {
      type: 'object',
      properties: {
        name: { type: 'string' },
        age: { type: 'number' },
        city: { type: 'string' },
      },
    },
  },
  // Second `user` message will merge with the first
  {
    type: 'data',
    kind: 'user',
    data: { age: 30 },
  },
]);
What the LLM Sees
[
  {
    role: 'user',
    content: {
      type: 'text',
      text: "Update the user's city to Austin",
    },
  },
  {
    role: 'user',
    content: {
      type: 'text',
      text: `
## Data: ¶user
{
  "name": "John Doe",
  "age": 30
}
Represents the current user.
Schema for ¶user:
{
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "age": { "type": "number" },
    "city": { "type": "string" }
  }
}`,
    },
  },
];
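The transformation from a merged Data message to the text block above could be sketched roughly as follows. The function name renderDataMessage and the interfaces are hypothetical, and the exact whitespace of the serialized JSON is an assumption; only the overall layout (heading, data, description, schema) is taken from the example.

```typescript
interface DataMessage {
  type: 'data';
  kind: string;
  data: unknown;
  schema?: object;
  description?: string;
}

interface TextContent {
  type: 'text';
  text: string;
}

// Render one (already merged) Data message as a text block.
// Layout follows the example: heading, data, description, schema.
function renderDataMessage(msg: DataMessage): TextContent {
  const ref = `¶${msg.kind}`;
  const lines: string[] = [
    `## Data: ${ref}`,
    JSON.stringify(msg.data, null, 2),
  ];
  if (msg.description) {
    lines.push(msg.description);
  }
  if (msg.schema) {
    lines.push(`Schema for ${ref}:`, JSON.stringify(msg.schema, null, 2));
  }
  return { type: 'text', text: lines.join('\n') };
}
```

Keeping the renderer separate from the merge step means the same merged object can be reused elsewhere, for example when resolving variables, without re-parsing text.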
Specializing the Data Message
The generic Data message is a foundational pattern. It is specialized for different roles by other parts of the system, often by assigning a specific kind.
- Input Message: An Input Message is a Data message with kind: 'input'. It formally declares the parameters a Request accepts, turning a static Request into a reusable, function-like component.
- State Message: A State Message is a Data message with kind: 'state'. It represents the persistent, evolving memory of a workflow. Its schema defines the structure of this memory, including what properties are available to be read from or written to.
- Output: The _outputPath mechanism creates new Data messages to persist the results of Tool Calls. When a Call with an _outputPath is executed, its result is appended to the context as a new Data message (often with kind: 'state'), making it available for subsequent steps.
- Instancing: The Instancing system uses the _instance property to distinguish Data messages. This key scopes a message to a specific execution thread in a batch operation. Data messages with different _instance values are treated as having different identities and will not be merged, ensuring data isolation.
- Loop: The Execution Loop relies on Data messages to maintain continuity. Specifically, the State Message is the primary vehicle for persisting information across the ticks of a Loop.
- Variables: The Variable system is the primary consumer of Data messages. Variable References (†<kind>.<path>) are used within Tool Calls to dynamically read values from Data messages in the context, such as an Input Message or a State Message.
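To make the Variable Reference form †<kind>.<path> more concrete, here is a minimal sketch of how such a reference might be resolved against Data messages. The function name resolveReference is hypothetical, and two details are assumptions not confirmed by this chapter: that the path is dot-separated, and that when several messages of the same kind remain unmerged, the last one is used.

```typescript
interface DataMessage {
  type: 'data';
  kind: string;
  data: unknown;
}

// Resolve a reference such as '†user.name' against the context.
// Assumptions: dot-separated path; the last message of a kind wins
// if the context has not been merged beforehand.
function resolveReference(ref: string, context: DataMessage[]): unknown {
  if (ref.charAt(0) !== '†') {
    throw new Error(`Not a variable reference: ${ref}`);
  }
  const [kind, ...path] = ref.slice(1).split('.');

  // Find the payload of the matching kind (undefined if none exists).
  let current: unknown = undefined;
  for (const msg of context) {
    if (msg.kind === kind) current = msg.data;
  }

  // Walk the path one property at a time.
  for (const key of path) {
    if (typeof current !== 'object' || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}
```

Returning undefined for a missing kind or path keeps the sketch simple; a stricter implementation might raise an error instead so that broken references are caught early.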
From Generic Data to Specific Roles
The Data message provides a generic container for structured information. However, for an agent to use this data effectively, it needs to understand the data's role in the process. The following chapters describe how this generic Data message is specialized for specific purposes, such as providing initial parameters for a workflow.
006: Agent/Input describes how a Data message is used to create a structured prompt for an agent.