Acts of Emergence

005: Agent/Data

A persistent context message containing a data value and an optional schema. It is retained across an agent's execution loop to provide a stable, structured context.

The Data Protocol is a low-level pattern for providing structured, self-describing information. It serves as a foundational mechanism used by other subsystems, such as Input and State, to manage structured data within an agent's context. Unlike ephemeral messages, Data messages are persistent and are retained across multiple steps of an agent's process, providing a stable context for multi-step tasks.

The Data Message

A Data message is a simple construct for adding structured context to a Request. It is a message object that contains the following properties:

  • data: Any JSON value (e.g., string, number, object, array) that contains the payload.
  • schema: An optional JSON Schema that defines the structure and semantic meaning of the data.
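  • kind: An optional string that identifies the message's role (e.g., "state", "input"). This allows the system and the LLM to differentiate between various types of data within the same context.
  • description: An optional string describing the data. In the example below it is passed to the LLM as plain text alongside the payload.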

By pairing data with an optional schema, a Data message makes the context machine-readable. The schema acts as a blueprint, telling the LLM what each property means, what its type is, and what constraints apply. This guides the LLM's reasoning and also serves as documentation, showing users what can be changed or configured.
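The shape described above can be sketched as a TypeScript type. This is only an illustration: the names `JsonValue`, `DataMessage`, and `isDataMessage` are hypothetical, and the `type: 'data'` discriminator and `description` field are taken from the example later in this chapter rather than from a published type definition.

```typescript
// Hypothetical types for illustration; not the framework's actual definitions.
type JsonValue =
  | string
  | number
  | boolean
  | null
  | JsonValue[]
  | { [key: string]: JsonValue };

interface DataMessage {
  type: 'data'; // discriminator, as used in the example in this chapter
  data: JsonValue; // the payload: any JSON value
  schema?: { [key: string]: JsonValue }; // optional JSON Schema (object form assumed)
  kind?: string; // optional role, e.g. "state" or "input"
  description?: string; // optional free-text description
}

// Narrowing helper: checks only the discriminator and the presence of a payload.
// It does not validate `data` against `schema`.
function isDataMessage(value: unknown): value is DataMessage {
  if (typeof value !== 'object' || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return candidate.type === 'data' && 'data' in candidate;
}
```

A guard like this lets calling code separate Data messages from text messages in a mixed list before any merging or rendering takes place.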

Merging and Identity

The protocol is designed to handle multiple Data messages within a single context. When several messages share the same identity, they are merged into a single, coherent object. This is particularly useful for scenarios like applying a series of state patches.

A message's identity is primarily determined by its kind. For example, all messages of kind: "state" without any other distinguishing features are considered to have the same identity. This identity can be further specified by other protocols, most notably Instancing, which allows for parallel processing by creating distinct contexts.

When multiple mergeable messages are present (e.g., several state objects representing patches), the system can handle this in two ways. First, the agent's execution logic can explicitly merge these objects into a single, coherent state object before presenting it to the LLM. This reduces the cognitive load on the model. Second, the LLM itself can "mentally merge" the information in its latent space, understanding that the separate messages represent different facets of a single concept.
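The first option, explicit merging in the execution logic, might look roughly like the sketch below. It rests on several assumptions that the text does not settle: identity is taken to be the `kind` alone (ignoring Instancing), object payloads are merged shallowly with later messages winning, non-object payloads are simply replaced, and messages without a `kind` are left unmerged. The function name `mergeDataMessages` is hypothetical.

```typescript
// Hypothetical message shape, mirroring the fields used in this chapter.
interface DataMessage {
  type: 'data';
  kind?: string;
  description?: string;
  data: unknown;
  schema?: Record<string, unknown>;
}

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Assumption: messages with the same `kind` share an identity.
// Returns new message objects; the input messages are not mutated.
function mergeDataMessages(messages: DataMessage[]): DataMessage[] {
  const byKind = new Map<string, DataMessage>();
  const result: DataMessage[] = [];

  for (const message of messages) {
    if (message.kind === undefined) {
      // Assumption: without a kind there is no identity to merge on.
      result.push({ ...message });
      continue;
    }
    const existing = byKind.get(message.kind);
    if (existing === undefined) {
      const copy = { ...message };
      byKind.set(message.kind, copy);
      result.push(copy); // keeps the position of the first occurrence
      continue;
    }
    // Shallow merge of object payloads; later properties win.
    existing.data =
      isPlainObject(existing.data) && isPlainObject(message.data)
        ? { ...existing.data, ...message.data }
        : message.data;
    // A later schema or description replaces an earlier one only if present.
    if (message.schema !== undefined) existing.schema = message.schema;
    if (message.description !== undefined) existing.description = message.description;
  }
  return result;
}
```

A shallow merge is enough for flat patches such as `{ name }` followed by `{ age }`. Nested state would need a deep merge or a patch format, which this sketch deliberately leaves out.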

Example of How Data Messages Are Seen by the LLM

  • The text message is passed through as-is.
  • The data messages are merged and transformed into a single text message.

What the Code Looks Like

Agent.Request(config, schema, [
  {
    type: 'text',
    text: "Update the user's city to Austin",
  },

  // The schema is serialized so the LLM can understand the data's semantics
  {
    type: 'data',
    kind: 'user',
    description: 'Represents the current user.',
    data: { name: 'John Doe' },
    schema: {
      type: 'object',
      properties: {
        name: { type: 'string' },
        age: { type: 'number' },
        city: { type: 'string' },
      },
    },
  },

  // Second `user` message will merge with the first
  {
    type: 'data',
    kind: 'user',
    data: { age: 30 },
  },
]);

What the LLM Sees

[
  {
    role: 'user',
    content: {
      type: 'text',
      text: "Update the user's city to Austin",
    },
  },
  {
    role: 'user',
    content: {
      type: 'text',
      text: `
        ## Data: ¶user
        {
          "name": "John Doe",
          "age": 30
        }
        Represents the current user.
        Schema for ¶user:
        {
          "type": "object",
          "properties": {
            "name": { "type": "string" },
            "age": { "type": "number" },
            "city": { "type": "string" }
          }
        }`,
    },
  },
];
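The transformation from a merged Data message to the text shown above could be sketched as follows. This is an approximation under assumptions, not the framework's renderer: the function name `renderDataMessage` is hypothetical, the fallback label for a message without a `kind` is invented for the sketch, and `JSON.stringify` with two-space indentation will expand nested schema objects over more lines than the hand-formatted listing does.

```typescript
// Hypothetical message shape, mirroring the fields used in this chapter.
interface DataMessage {
  type: 'data';
  kind?: string;
  description?: string;
  data: unknown;
  schema?: Record<string, unknown>;
}

interface RenderedMessage {
  role: 'user';
  content: { type: 'text'; text: string };
}

// Renders one (already merged) Data message as a user text message,
// following the section order of the listing above:
// heading, payload, description, schema.
function renderDataMessage(message: DataMessage): RenderedMessage {
  // Assumption: fall back to a generic label when no kind is given.
  const label = `¶${message.kind ?? 'data'}`;
  const lines: string[] = [`## Data: ${label}`, JSON.stringify(message.data, null, 2)];

  if (message.description !== undefined) {
    lines.push(message.description);
  }
  if (message.schema !== undefined) {
    lines.push(`Schema for ${label}:`, JSON.stringify(message.schema, null, 2));
  }
  return { role: 'user', content: { type: 'text', text: lines.join('\n') } };
}
```

Keeping the rendering separate from the merge step means the same merged object can be validated or logged before it is turned into prompt text.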

Specializing the Data Message

The generic Data message is a foundational pattern. It is specialized for different roles by other parts of the system, often by assigning a specific kind.

From Generic Data to Specific Roles

The Data message provides a generic container for structured information. However, for an agent to use this data effectively, it needs to understand the data's role in the process. The following chapters describe how this generic Data message is specialized for specific purposes, such as providing initial parameters for a workflow.

006: Agent/Input describes how a Data message is used to create a structured prompt for an agent.