005: Agent/Data

Requires:
- 001: Agent/Request
Enables:
- 006: Agent/Input
- 009: Agent/State
Complemented by:
- 010: Agent/Loop
- 013: Agent/Instancing

A persistent context message containing a data value and an optional schema. It is retained across an agent's execution loop to provide a stable, structured context.

The Data Protocol is a low-level pattern for providing structured, self-describing information. It is the foundational mechanism that other subsystems, such as Input and State, use to manage structured data within an agent's context. Unlike ephemeral messages, Data messages persist across the steps of an agent's process, giving multi-step tasks a stable context.

001: Agent/Request

The Data Message

A Data message is a simple construct for adding structured context to a Request. It is a message object that contains the following properties:

data: Any JSON value (e.g., string, number, object, array) that contains the payload.
schema: An optional JSON Schema that defines the structure and semantic meaning of the data.
kind: An optional string that identifies the message's role (e.g., "state", "input"). This allows the system and the LLM to differentiate between various types of data within the same context.
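
As a rough illustration, the three properties above could be typed as follows. This is a minimal sketch, not the protocol's actual definition: the names `JsonValue`, `JsonSchema`, and `DataMessage` are hypothetical, and the `type: 'data'` discriminator is taken from the example later in this section.

```typescript
// Hypothetical type names; the source only names `data`, `schema`, and `kind`.
type JsonValue =
  | string
  | number
  | boolean
  | null
  | JsonValue[]
  | { [key: string]: JsonValue };

// A JSON Schema is itself a JSON object; it is kept loose here on purpose.
type JsonSchema = { [key: string]: JsonValue };

interface DataMessage {
  type: 'data';        // discriminator, as used in the example in this section
  data: JsonValue;     // the payload
  schema?: JsonSchema; // optional description of the payload's structure
  kind?: string;       // optional role, e.g. "state" or "input"
}

// A Data message with a kind but no schema is still valid under this sketch.
const stateMessage: DataMessage = {
  type: 'data',
  kind: 'state',
  data: { step: 1 },
};
```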

By pairing data with an optional schema, a Data message makes the context machine-readable. The schema acts as a blueprint, explaining to the LLM what each property means, what its type is, and what constraints apply. This not only guides the LLM's reasoning but also serves as a form of documentation, showing users what is possible to change or configure.

Merging and Identity

The protocol is designed to handle multiple Data messages within a single context. When several messages share the same identity, they are merged into a single, coherent object. This is particularly useful for scenarios like applying a series of state patches.

013: Agent/Instancing

A message's identity is primarily determined by its kind. For example, all messages of kind: "state" without any other distinguishing features are considered to have the same identity. This identity can be further specified by other protocols, most notably Instancing, which allows for parallel processing by creating distinct contexts.

When multiple mergeable messages are present (e.g., several state objects representing patches), the system can handle this in two ways. First, the agent's execution logic can explicitly merge these objects into a single, coherent state object before presenting it to the LLM. This reduces the cognitive load on the model. Second, the LLM itself can "mentally merge" the information in its latent space, understanding that the separate messages represent different facets of a single concept.
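
The first option, explicit merging by the execution logic, might look roughly like the sketch below. Several details are assumptions rather than anything the protocol states: identity is taken to be the pair of `kind` and `_instance`, `_instance` is assumed to be a string, object payloads are merged shallowly with later messages winning, and the first schema seen is kept. The function names `identityOf` and `mergeDataMessages` are hypothetical.

```typescript
// Minimal shape for this sketch only.
interface DataMessage {
  type: 'data';
  kind?: string;
  _instance?: string; // assumed to be a string key
  data: unknown;
  schema?: Record<string, unknown>;
}

// Assumed identity rule: same kind and same _instance means same identity.
function identityOf(message: DataMessage): string {
  return JSON.stringify([message.kind ?? null, message._instance ?? null]);
}

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Assumed merge rule: shallow merge of object payloads, later messages win;
// a non-object payload replaces the previous value; the first schema is kept.
function mergeDataMessages(messages: DataMessage[]): DataMessage[] {
  const byIdentity: Record<string, DataMessage> = {};
  const order: string[] = []; // preserves first-seen order of identities
  for (const message of messages) {
    const key = identityOf(message);
    const existing = byIdentity[key];
    if (existing === undefined) {
      byIdentity[key] = { ...message };
      order.push(key);
      continue;
    }
    const data =
      isPlainObject(existing.data) && isPlainObject(message.data)
        ? { ...existing.data, ...message.data }
        : message.data;
    byIdentity[key] = { ...existing, data, schema: existing.schema ?? message.schema };
  }
  return order.map((key) => byIdentity[key]);
}

const result = mergeDataMessages([
  { type: 'data', kind: 'state', data: { name: 'John Doe' } },
  { type: 'data', kind: 'state', data: { age: 30 } },
  { type: 'data', kind: 'state', _instance: 'a', data: { age: 41 } },
]);
```

Under these assumptions, the first two messages collapse into one `state` object, while the message carrying `_instance: 'a'` stays separate. A real implementation may well merge deeply or apply patches differently.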

Example of how Data messages are seen by the LLM

The text message is passed through as-is.
The data messages are merged and transformed into a single text message.

What the code looks like

Agent.Request(config, schema, [
  {
    type: 'text',
    text: "Update the user's city to Austin",
  },

  // Schema is serialized for LLM to understand semantics
  {
    type: 'data',
    kind: 'user',
    description: 'Represents the current user.',
    data: { name: 'John Doe' },
    schema: {
      type: 'object',
      properties: {
        name: { type: 'string' },
        age: { type: 'number' },
        city: { type: 'string' },
      },
    },
  },

  // Second `user` message will merge with the first
  {
    type: 'data',
    kind: 'user',
    data: { age: 30 },
  },
]);

What the LLM Sees

[
  {
    role: 'user',
    content: {
      type: 'text',
      text: "Update the user's city to Austin",
    },
  },
  {
    role: 'user',
    content: {
      type: 'text',
      text: `
        ## Data: ¶user
        {
          "name": "John Doe",
          "age": 30
        }
        Represents the current user.
        Schema for ¶user:
        {
          "type": "object",
          "properties": {
            "name": { "type": "string" },
            "age": { "type": "number" },
            "city": { "type": "string" }
          }
        }`,
    },
  },
];
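
The transformation from a merged Data message to the text shown above could be sketched as follows. The function name `renderDataMessage` and the `MergedData` shape are hypothetical, and the layout simply mirrors the example output (without its indentation); the actual serializer may differ in whitespace and ordering.

```typescript
// Hypothetical shape of an already-merged Data message.
interface MergedData {
  kind: string;
  description?: string;
  data: unknown;
  schema?: Record<string, unknown>;
}

// Builds the text block: heading, payload, optional description, optional schema.
function renderDataMessage(message: MergedData): string {
  const lines: string[] = [
    `## Data: ¶${message.kind}`,
    JSON.stringify(message.data, null, 2),
  ];
  if (message.description) {
    lines.push(message.description);
  }
  if (message.schema) {
    lines.push(`Schema for ¶${message.kind}:`);
    lines.push(JSON.stringify(message.schema, null, 2));
  }
  return lines.join('\n');
}

const text = renderDataMessage({
  kind: 'user',
  description: 'Represents the current user.',
  data: { name: 'John Doe', age: 30 },
  schema: {
    type: 'object',
    properties: {
      name: { type: 'string' },
      age: { type: 'number' },
      city: { type: 'string' },
    },
  },
});
```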

Specializing the Data Message

The generic Data message is a foundational pattern. It is specialized for different roles by other parts of the system, often by assigning a specific kind.

- 006: Agent/Input
Input Message: An Input Message is a Data message with kind: 'input'. It formally declares the parameters a Request accepts, turning a static Request into a reusable, function-like component.
- 009: Agent/State
State Message: A State Message is a Data message with kind: 'state'. It represents the persistent, evolving memory of a workflow. Its schema defines the structure of this memory, including what properties are available to be read from or written to.
- 008: Agent/Output
Output: The _outputPath mechanism creates new Data messages to persist the results of Tool Calls. When a Call with an _outputPath is executed, its result is appended to the context as a new Data message (often with kind: 'state'), making it available for subsequent steps.
- 013: Agent/Instancing
Instancing: The Instancing system uses the _instance property to distinguish Data messages. This key scopes a message to a specific execution thread in a batch operation. Data messages with different _instance values are treated as having different identities and will not be merged, ensuring data isolation.
- 010: Agent/Loop
Loop: The Execution Loop relies on Data messages to maintain continuity. Specifically, the State Message is the primary vehicle for persisting information across the ticks of a Loop.
- 007: Agent/Variables
Variables: The Variable system is the primary consumer of Data messages. Variable References (†<kind>.<path>) are used within Tool Calls to dynamically read values from Data messages in the context, such as an Input Message or a State Message.
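
To make the Variable Reference form †<kind>.<path> more concrete, here is a minimal sketch of how such a reference might be resolved against the Data messages in a context. It assumes a dot-separated path, an already-merged context with at most one message per kind, and `undefined` for anything that cannot be found; `resolveReference` is a hypothetical name, and the real Variable system is described in 007: Agent/Variables.

```typescript
// Minimal shape for this sketch only.
interface DataMessage {
  type: 'data';
  kind?: string;
  data: unknown;
}

// Resolves a reference such as "†state.user.city" against the context.
function resolveReference(reference: string, context: DataMessage[]): unknown {
  if (reference.charAt(0) !== '†') {
    throw new Error('Not a variable reference: ' + reference);
  }
  const [kind, ...path] = reference.slice(1).split('.');
  // Assumption: the context is already merged, so the first match is the only one.
  const message = context.filter((m) => m.kind === kind)[0];
  if (!message) {
    return undefined;
  }
  let current: unknown = message.data;
  for (const segment of path) {
    if (typeof current !== 'object' || current === null) {
      return undefined;
    }
    current = (current as Record<string, unknown>)[segment];
  }
  return current;
}

const context: DataMessage[] = [
  { type: 'data', kind: 'input', data: { query: 'weather' } },
  { type: 'data', kind: 'state', data: { user: { city: 'Austin' } } },
];

const city = resolveReference('†state.user.city', context);
```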

From Generic Data to Specific Roles

The Data message provides a generic container for structured information. However, for an agent to use this data effectively, it needs to understand the data's role in the process. The following chapters describe how this generic Data message is specialized for specific purposes, such as providing initial parameters for a workflow.

006: Agent/Input describes how a Data message is used to create a structured prompt for an agent.