005: Agent/Loop
- Requires:
This document describes the Execution Loop, which enables an agent to perform multi-step tasks by iteratively making Requests. This iterative process of context assembly, tool use, and feedback is what is commonly meant when referring to an "agent."
The Execution Loop
The execution loop is the primary mechanism for autonomous, multi-step execution. It operates via a nested loop structure:
- Outer Loop (Request Generation): The agent's lifecycle is a sequence of Requests. It starts with an initial context and enters a loop.
- Request & Call Streaming: Inside the loop, it invokes a single Request. The Request streams back Calls as they are generated, which are collected into a pending queue.
- Inner Loop (Call Orchestration): For each Request, an inner orchestration loop is responsible for executing its associated Calls. This process is highly concurrent:
  - The orchestrator continuously scans the queue of pending Calls to find all that are currently unblocked (i.e., their dependencies are met).
  - All unblocked Calls can be presented for confirmation and then executed in parallel. This concurrency is safe because the agent's State is immutable: once a value is written to a specific path via _outputPath, it cannot be overwritten. This allows the model to propose mutually exclusive Calls, such as different branches of a conditional, that write to the same output path. The first of these Calls to execute successfully sets the value, and any others that were alternatives will not be executed because their preconditions (the path being empty) are no longer met. This ensures a deterministic outcome without conflicts.
  - As each Call completes, its output updates the shared context, potentially unblocking other pending Calls.
  - This reactive, parallel execution continues until the stream for the current Request is closed and all of its pending Calls have been drained. This model significantly reduces latency, as the agent can start working on multiple independent steps simultaneously, even before the full plan is known.
- Termination Check: Once the inner loop completes, the agent inspects the final Solution from the parent Request. If it contains no Calls, the agent's goal is considered complete, and the outer loop terminates.
- Continuation: If the Solution did contain Calls, the agent loops back to the Request & Call Streaming step, invoking a new Request with the enriched context that now contains the results of the previous execution step.
- Output Generation: Upon termination, the output field of the final Solution contains the result, conforming to the user-defined output schema.
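The steps above can be sketched in Python with asyncio. This is a minimal illustration under stated assumptions, not the protocol's reference implementation: the names State, Call, Solution, run_request, and run_agent are hypothetical, output_path stands in for _outputPath, and a Request is assumed to be an async generator that streams Call items followed by an optional Solution item carrying the output. The confirmation step is omitted, and mutually exclusive Calls are handled by holding back any Call whose output path already has a Call in flight.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable, Optional


class State:
    """Write-once store: each path can be set exactly once."""

    def __init__(self) -> None:
        self._values: dict[str, Any] = {}

    def has(self, path: str) -> bool:
        return path in self._values

    def get(self, path: str) -> Any:
        return self._values[path]

    def write(self, path: str, value: Any) -> bool:
        if path in self._values:
            return False  # existing values are never overwritten
        self._values[path] = value
        return True


@dataclass
class Call:
    tool: Callable[..., Awaitable[Any]]
    inputs: list[str]  # paths that must be filled before this Call is unblocked
    output_path: str   # stands in for _outputPath


@dataclass
class Solution:
    calls: list[Call] = field(default_factory=list)
    output: Any = None


async def run_request(request, state: State) -> Solution:
    """Inner loop: run Calls as they unblock, even while the stream is open."""
    queue: asyncio.Queue = asyncio.Queue()

    async def pump() -> None:
        try:
            async for item in request(state):  # assumed: Calls, then maybe a Solution
                await queue.put(item)
        finally:
            await queue.put(None)  # sentinel: the stream is closed

    async def execute(call: Call) -> None:
        result = await call.tool(*(state.get(p) for p in call.inputs))
        state.write(call.output_path, result)

    pump_task = asyncio.create_task(pump())
    getter: Optional[asyncio.Task] = asyncio.create_task(queue.get())
    pending: list[Call] = []
    running: set[asyncio.Task] = set()
    claimed: set[str] = set()  # output paths with a Call already in flight
    solution = Solution()

    while getter is not None or running:
        waiting = running | ({getter} if getter is not None else set())
        done, _ = await asyncio.wait(waiting, return_when=asyncio.FIRST_COMPLETED)
        for task in done:
            if task is not getter:
                running.discard(task)
                task.result()  # re-raise any tool error
                continue
            item = task.result()
            getter = None if item is None else asyncio.create_task(queue.get())
            if isinstance(item, Call):
                solution.calls.append(item)
                pending.append(item)
            elif item is not None:
                solution.output = item.output
        for call in list(pending):
            ready = all(state.has(p) for p in call.inputs)
            if state.has(call.output_path):
                pending.remove(call)  # an alternative already filled this path
            elif ready and call.output_path not in claimed:
                pending.remove(call)
                claimed.add(call.output_path)
                running.add(asyncio.create_task(execute(call)))
    await pump_task
    return solution


async def run_agent(request, state: State) -> Any:
    """Outer loop: issue Requests until a Solution arrives with no Calls."""
    while True:
        solution = await run_request(request, state)
        if not solution.calls:
            return solution.output
```

In this sketch, a Request that streams a Call reading path n before the Call that writes n still completes: the first Call stays pending until n is filled, then runs on the next scan. Calls whose dependencies are never met are simply left unexecuted when the stream closes, which is one possible policy among several.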
Human-in-the-Loop
The new Execution Loop provides robust support for human oversight by placing the confirmation step just before execution. This ensures the user is only prompted to act on calls that are ready to run:
- Approval: Before an unblocked Call is executed, the system can present it to a user for approval. This is an efficient approach, as it prevents the user from having to review and confirm calls that might be blocked by dependencies and never run.
- Correction: The user can modify the parameters of a Call or even replace it with a different one.
These specific human-in-the-loop (HITL) mechanisms are not part of the core protocol. The architecture simply provides the necessary separation between proposing actions and executing them, giving developers the flexibility to implement any kind of intervention, from a simple manual approval to a complex, automated system with timeouts.
This capability is critical for safety and for collaborative tasks where the agent acts as an assistant. User adjustments and feedback can be leveraged by the Plan, allowing the agent to refine its strategy based on human input.
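Because the protocol leaves the intervention mechanism open, the following is only one possible shape for it, sketched in Python. Everything here is hypothetical: ProposedCall, the reviewer callback, and the review helper are illustrative names, and the rule that a timeout rejects the Call by default is an assumption rather than anything the protocol prescribes. The reviewer returns the Call to execute (unchanged, edited, or swapped for another) or None to reject it.

```python
import asyncio
from dataclasses import dataclass, replace
from typing import Any, Awaitable, Callable, Optional


@dataclass(frozen=True)
class ProposedCall:
    tool: str
    params: dict[str, Any]


# Hypothetical reviewer hook: returns the Call to run, or None to reject it.
Reviewer = Callable[[ProposedCall], Awaitable[Optional[ProposedCall]]]


async def review(
    call: ProposedCall,
    reviewer: Reviewer,
    timeout_s: float,
    approve_on_timeout: bool = False,
) -> Optional[ProposedCall]:
    """Ask the reviewer about an unblocked Call, with a fallback on timeout."""
    try:
        return await asyncio.wait_for(reviewer(call), timeout=timeout_s)
    except asyncio.TimeoutError:
        return call if approve_on_timeout else None


async def cap_amount(call: ProposedCall) -> Optional[ProposedCall]:
    """Example correction: lower an oversized parameter, then approve."""
    if call.params.get("amount", 0) > 100:
        return replace(call, params={**call.params, "amount": 100})
    return call


async def never_answers(call: ProposedCall) -> Optional[ProposedCall]:
    """Example of an unresponsive reviewer."""
    await asyncio.sleep(10)
    return call


async def demo() -> tuple[Optional[ProposedCall], Optional[ProposedCall]]:
    proposed = ProposedCall("transfer", {"amount": 500})
    corrected = await review(proposed, cap_amount, timeout_s=1.0)
    timed_out = await review(proposed, never_answers, timeout_s=0.01)
    return corrected, timed_out
```

Running asyncio.run(demo()) yields a corrected Call with amount capped at 100 and None for the reviewer that never responds. An orchestrator would call review only for Calls that are already unblocked, so the reviewer never sees a Call that could not run.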
The Role of Data in the Loop
The Execution Loop provides a dynamic structure for agent behavior, but its power comes from the data flowing within it. This is managed by the Data message, which is explored in 006: Agent/Data.