In Chapter 4: CoreLLM (LLM Abstraction), we saw how CoreLLM acts like a universal remote, letting our AgentNode talk to different AI brains (LLMs) easily. The AgentNode can ask the CoreLLM to generate text, and it can also give the CoreLLM a list of Tools it might want to use.
But this raises a question: How does the agent decide when to just talk versus when to use a specific tool? And can we guide this process to make the agent behave in a more structured way?
Imagine you want an agent to help you research a topic. A good process might be:
- Search: First, use a search tool to gather basic information.
- Think: Then, use its AI brain (LLM) to structure and analyze that information.
- Reflect: Finally, use the LLM again to summarize the key findings and potential biases.
If we just let the agent decide freely, it might jump straight to reflecting without searching, or it might search multiple times randomly. How can we ensure it follows these specific steps in the right order?
This is where the OrchestrationManager comes in. Think of it as the conductor of an orchestra. The agent's capabilities (talking via CoreLLM, using various Tools) are like the different instruments. The OrchestrationManager doesn't play any instruments itself, but it holds the musical score (the rules) and guides the musicians (the agent's capabilities) on when and how to play.
It helps create more structured, predictable, and controllable agent behavior by:
- Defining different stages or steps in a conversation.
- Specifying which tools are allowed at each step.
- Sometimes enforcing a strict sequence of tools that must be used.
These rules are defined in a special section called orchestration within the agent's blueprint, the Agent Configuration (AgentConfig).
Let's break down the "musical score" – the OrchestrationConfig – that the OrchestrationManager reads. This configuration is typically part of the agent's template.json file.
The orchestration block is the main JSON object within template.json that defines the rules. Here's a simplified example from the "Cognitive Reasoner" agent, which is designed for complex thinking tasks:
// Simplified from agents/cognitive-reasoner/template.json
{
// ... other AgentConfig properties ...
"orchestration": {
"description": "Guides the agent to select a mode (sequence) based on the query.",
"steps": [
// ... Step definitions go here ...
]
}
// ... other AgentConfig properties ...
}
Inside the orchestration block, you define a list of steps. Each step represents a distinct phase or mode of the conversation.
- name: A unique identifier for the step (e.g., "ResearchMode", "ProblemSolvingMode").
- description: Explains what happens in this step.
- isDefault: If true, this step is used when no other step's conditions are met. Think of it as the starting or fallback stage.
// Simplified step definition
{
"name": "ResearchMode",
"description": "Agent focuses on searching and analyzing.",
// ... other properties like conditions, sequence ...
},
{
"name": "DefaultMode",
"description": "General mode, no specific restrictions.",
"isDefault": true
}
Some steps might require the agent to use a specific set of tools in a precise order. This is defined using the sequence property within a step.
// Simplified step with a sequence
{
"name": "ResearchMode",
"description": "Research using Search -> Think -> Reflect.",
"sequence": [
"search", // Must use 'search' first
"think", // Then must use 'think'
"reflect" // Finally must use 'reflect'
],
// ... other properties ...
}
If a step has a sequence, the OrchestrationManager will only allow the agent to use the next required tool in that list.
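To make the gating rule concrete, here is a minimal sketch of how "only the next required tool" could be computed from a step and a position in its sequence. The `SequenceStep` type and the `nextAllowedTools` function are hypothetical names introduced for illustration, and the fallback to all tools once the sequence is finished is an assumption, not confirmed library behavior.

```typescript
// Hypothetical shape for illustration; the real types live in agentdock-core.
interface SequenceStep {
  name: string;
  sequence?: string[];
}

// Returns the tools the agent may call next, given its position in the sequence.
function nextAllowedTools(
  step: SequenceStep,
  sequenceIndex: number,
  allToolIds: string[],
): string[] {
  const expected = step.sequence?.[sequenceIndex];
  // Still inside the sequence and the expected tool exists: allow only that tool.
  if (expected !== undefined && allToolIds.includes(expected)) {
    return [expected];
  }
  // Assumption: once the sequence is finished (or the tool is missing), allow all tools.
  return allToolIds;
}

const researchMode: SequenceStep = {
  name: "ResearchMode",
  sequence: ["search", "think", "reflect"],
};
const allTools = ["search", "think", "reflect", "brainstorm"];

const atStart = nextAllowedTools(researchMode, 0, allTools); // ["search"]
const afterSearch = nextAllowedTools(researchMode, 1, allTools); // ["think"]
const afterAll = nextAllowedTools(researchMode, 3, allTools); // all four tools
console.log(atStart, afterSearch, afterAll);
```

The index is passed in explicitly here; in the real system it would come from the per-session orchestration state.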
How does the agent move from one step to another? conditions define the rules for activating a specific step.
- type: The kind of condition. Common types:
  - tool_used: Becomes true if a specific tool (value) was recently used.
  - sequence_match: Becomes true if the sequence of recently used tools matches the sequence defined in this step (useful for triggering a step after a sequence completes).
- value: The value to check against (e.g., the tool name for tool_used).
// Simplified step with a condition
{
"name": "AnalysisMode",
"description": "Activates after the 'search' tool has been used.",
"conditions": [
{ "type": "tool_used", "value": "search" }
]
// ... other properties ...
}
The OrchestrationManager checks these conditions to determine which step is currently active.
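The condition check can be sketched as a small pure function over the tool history. Everything below is illustrative: the `Condition` and `Step` types, `conditionHolds`, and `pickActiveStep` are hypothetical names. The sketch assumes that a step activates only when all of its conditions hold, and that sequence_match compares the tail of the history. The real matching rules may differ.

```typescript
// Hypothetical shapes modeled on the JSON above; not the library's actual types.
type Condition =
  | { type: "tool_used"; value: string }
  | { type: "sequence_match"; value?: string };

interface Step {
  name: string;
  isDefault?: boolean;
  sequence?: string[];
  conditions?: Condition[];
}

// True when the most recent tools in `history` equal `sequence`, in order.
function endsWithSequence(history: string[], sequence: string[]): boolean {
  if (sequence.length === 0 || history.length < sequence.length) return false;
  const tail = history.slice(-sequence.length);
  return tail.every((tool, i) => tool === sequence[i]);
}

function conditionHolds(cond: Condition, step: Step, recentlyUsedTools: string[]): boolean {
  switch (cond.type) {
    case "tool_used":
      return recentlyUsedTools.includes(cond.value);
    case "sequence_match":
      return endsWithSequence(recentlyUsedTools, step.sequence ?? []);
  }
}

// Assumption: first step whose conditions ALL hold wins; otherwise use the default step.
function pickActiveStep(steps: Step[], recentlyUsedTools: string[]): Step | undefined {
  const matched = steps.find((s) => {
    const conds = s.conditions ?? [];
    return conds.length > 0 && conds.every((c) => conditionHolds(c, s, recentlyUsedTools));
  });
  return matched ?? steps.find((s) => s.isDefault === true);
}

const steps: Step[] = [
  { name: "AnalysisMode", conditions: [{ type: "tool_used", value: "search" }] },
  { name: "DefaultMode", isDefault: true },
];

console.log(pickActiveStep(steps, [])?.name); // "DefaultMode"
console.log(pickActiveStep(steps, ["search"])?.name); // "AnalysisMode"
```

Keeping the evaluation free of side effects makes it easy to test a step configuration against a sample tool history before wiring it into an agent.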
Within a step, even if there isn't a strict sequence, you can still control which tools the agent is allowed or forbidden to use.
- allowed: A list of tool names. Only these tools can be used in this step.
- denied: A list of tool names. These tools cannot be used in this step.
// Simplified step with tool availability rules
{
"name": "CreativeMode",
"description": "Focus on brainstorming, no searching allowed.",
"availableTools": {
"allowed": ["think", "brainstorm"], // Only these are okay
"denied": ["search"] // Explicitly forbid search
}
}
If a step has both sequence and availableTools, the sequence takes priority: only the next tool in the sequence is allowed. availableTools is more commonly used in steps without a sequence.
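For steps without a sequence, the allow/deny filtering might look like the following sketch. The `AvailableTools` type and `filterByAvailability` function are hypothetical. The sketch also assumes that a non-empty allowed list acts as a whitelist and that denied is applied last, so it wins over allowed. The source does not spell out that precedence.

```typescript
// Hypothetical shape mirroring the "availableTools" JSON block above.
interface AvailableTools {
  allowed?: string[];
  denied?: string[];
}

function filterByAvailability(allToolIds: string[], rules?: AvailableTools): string[] {
  if (!rules) return allToolIds; // No rules: every tool stays available.
  let tools = allToolIds;
  // Assumption: a non-empty `allowed` list keeps only the listed tools.
  if (rules.allowed && rules.allowed.length > 0) {
    const allowed = new Set(rules.allowed);
    tools = tools.filter((t) => allowed.has(t));
  }
  // Assumption: `denied` is applied last, so it overrides `allowed`.
  if (rules.denied && rules.denied.length > 0) {
    const denied = new Set(rules.denied);
    tools = tools.filter((t) => !denied.has(t));
  }
  return tools;
}

const everyTool = ["search", "think", "brainstorm", "reflect"];

const creative = filterByAvailability(everyTool, {
  allowed: ["think", "brainstorm"],
  denied: ["search"],
});
const noSearch = filterByAvailability(everyTool, { denied: ["search"] });

console.log(creative); // ["think", "brainstorm"]
console.log(noSearch); // ["think", "brainstorm", "reflect"]
```

In the "CreativeMode" example the denied entry is redundant under these assumptions, since "search" is already excluded by the allowed list. Listing it anyway documents the intent.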
The OrchestrationManager is the actual system component (a class in the code) that puts all these rules into action during a conversation.
- It reads the OrchestrationConfig for the specific agent.
- It keeps track of the current state for each user session (which step is active? what tools were used recently? how far into a sequence are we?). This state is managed by an internal helper called OrchestrationStateManager.
- It evaluates the conditions to determine the current activeStep.
- It filters the list of all possible tools based on the active step's sequence or availableTools.
- It updates the state when a tool is used (e.g., adds the tool to history, advances the sequence index).
Let's follow the flow when a user sends a message to an agent using orchestration:
- User Message: You send a message, e.g., "Research the impact of AI on jobs."
- AgentNode Receives: The agent's main Nodes (BaseNode, AgentNode) receives the message.
- Check Orchestration: Before calling the CoreLLM (LLM Abstraction), the AgentNode asks the OrchestrationManager: "For this user session, given the OrchestrationConfig, what's the current state and which tools are allowed right now?"
- Manager Evaluates: The OrchestrationManager retrieves the session's state (using OrchestrationStateManager). Let's say the state indicates the "ResearchMode" step is active, and the sequence requires "search" next. The manager determines that only the "search" tool is currently allowed.
- Filtered Tools: The OrchestrationManager tells the AgentNode: "Only allow the 'search' tool."
- LLM Call: The AgentNode calls CoreLLM.streamText(), providing the user message, conversation history, and only the allowed tools (in this case, just "search").
- LLM Decides: The LLM analyzes the request ("Research...") and sees that the "search" tool is available and appropriate. It decides to use it.
- Tool Execution: The system executes the "search" tool.
- Tool Result & State Update: The "search" tool returns its results. The AgentNode (often via a helper service like LLMOrchestrationService) informs the OrchestrationManager: "The 'search' tool was just used for this session."
- Manager Updates State: The OrchestrationManager updates the session state: adds "search" to the recentlyUsedTools history and advances the sequenceIndex for "ResearchMode" (now expecting "think").
- LLM Continues: The AgentNode gives the search results back to the CoreLLM.
- Next Interaction: Now, if the LLM needs to use another tool, the AgentNode will again ask the OrchestrationManager. This time, the manager will see the state expects "think" next in the sequence and will only allow the "think" tool.
- Final Response: The process continues until the sequence is complete or the LLM generates a final text response to the user.
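The loop described in the steps above can be simulated with a small in-memory stand-in. This is a sketch only: `TinyOrchestrator` and `SessionState` are hypothetical names, the LLM is replaced by a stub that always picks the first tool it is offered, and state lives in a plain Map rather than the real session storage.

```typescript
// Hypothetical per-session state; the field names follow the ones mentioned in the text.
interface SessionState {
  recentlyUsedTools: string[];
  sequenceIndex: number;
}

// Illustrative stand-in for the manager: one fixed sequence, in-memory sessions.
class TinyOrchestrator {
  private sessions = new Map<string, SessionState>();
  private sequence: string[];

  constructor(sequence: string[]) {
    this.sequence = sequence;
  }

  private state(sessionId: string): SessionState {
    let s = this.sessions.get(sessionId);
    if (!s) {
      s = { recentlyUsedTools: [], sequenceIndex: 0 };
      this.sessions.set(sessionId, s);
    }
    return s;
  }

  // Only the next expected tool is offered while the sequence is in progress.
  getAllowedTools(sessionId: string, allToolIds: string[]): string[] {
    const expected = this.sequence[this.state(sessionId).sequenceIndex];
    return expected !== undefined && allToolIds.includes(expected) ? [expected] : allToolIds;
  }

  // Records the tool in history and advances the index when it matches the sequence.
  processToolUsage(sessionId: string, toolName: string): void {
    const s = this.state(sessionId);
    s.recentlyUsedTools.push(toolName);
    if (this.sequence[s.sequenceIndex] === toolName) s.sequenceIndex += 1;
  }
}

const toolIds = ["search", "think", "reflect"];
const orchestrator = new TinyOrchestrator(["search", "think", "reflect"]);
const trace: string[][] = [];

for (let turn = 0; turn < 3; turn++) {
  const allowed = orchestrator.getAllowedTools("session-1", toolIds);
  trace.push(allowed);
  const chosen = allowed[0]; // Stub LLM: always picks the first offered tool.
  if (chosen === undefined) break;
  orchestrator.processToolUsage("session-1", chosen);
}

console.log(trace); // [["search"], ["think"], ["reflect"]]
// A different session has its own state and starts from the beginning.
const otherSession = orchestrator.getAllowedTools("session-2", toolIds); // ["search"]
console.log(otherSession);
```

Because state is keyed by session ID, two conversations with the same agent can be at different points in the same sequence.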
Let's see how the OrchestrationManager coordinates this.
High-Level Flow Diagram:
sequenceDiagram
participant User
participant AgentNode
participant OrchManager as OrchestrationManager
participant OrchState as OrchestrationStateManager
participant CoreLLM
User->>AgentNode: Send message ("Research...")
AgentNode->>OrchManager: Get Allowed Tools (SessionID, Config)
OrchManager->>OrchState: Get Current State (SessionID)
OrchState-->>OrchManager: Return State (e.g., step=ResearchMode, index=0)
Note right of OrchManager: Sequence expects 'search'
OrchManager-->>AgentNode: Return Allowed Tools (['search'])
AgentNode->>CoreLLM: Generate Response (with ['search'])
CoreLLM-->>AgentNode: Request to use 'search'
Note over AgentNode,CoreLLM: Tool 'search' executes...
AgentNode->>OrchManager: Process Tool Usage ('search', SessionID, Config)
OrchManager->>OrchState: Update State (add 'search' to history, index=1)
OrchState-->>OrchManager: Confirm Update
OrchManager-->>AgentNode: Acknowledge Tool Processed
AgentNode->>CoreLLM: Provide 'search' result
CoreLLM-->>AgentNode: Generate response / Request next tool ('think')
AgentNode->>User: Send response
Key Code Components:
- OrchestrationManager (agentdock-core/src/orchestration/index.ts): The main class coordinating the logic.

    // Simplified from agentdock-core/src/orchestration/index.ts
    import { OrchestrationStateManager, createOrchestrationStateManager } from './state';
    import { StepSequencer, createStepSequencer } from './sequencer';
    // ... other imports

    export class OrchestrationManager {
      private stateManager: OrchestrationStateManager;
      private sequencer: StepSequencer;

      constructor(options: OrchestrationManagerOptions = {}) {
        // Gets or creates the state manager (handles storage)
        this.stateManager = createOrchestrationStateManager(options);
        // Gets or creates the sequencer (handles sequence logic)
        this.sequencer = createStepSequencer(this.stateManager);
      }

      // Determines the current step based on conditions and state
      async getActiveStep(config, messages, sessionId): Promise<OrchestrationStep | undefined> {
        const state = await this.stateManager.getOrCreateState(sessionId);
        // ... logic to check conditions against state.recentlyUsedTools ...
        // ... finds matching step or default step ...
        // ... updates state.activeStep if changed ...
        return activeStep;
      }

      // Determines allowed tools based on the active step's rules
      async getAllowedTools(config, messages, sessionId, allToolIds): Promise<string[]> {
        const activeStep = await this.getActiveStep(config, messages, sessionId);
        if (!activeStep) return allToolIds; // No rules, allow all

        // If sequence, ask the sequencer for the next tool
        if (activeStep.sequence?.length) {
          return this.sequencer.filterToolsBySequence(activeStep, sessionId, allToolIds);
        }

        // Otherwise, apply allowed/denied rules
        // ... logic using activeStep.availableTools ...
        return filteredTools;
      }

      // Updates state after a tool is used
      async processToolUsage(config, messages, sessionId, toolName): Promise<void> {
        const activeStep = await this.getActiveStep(config, messages, sessionId);
        if (!activeStep) return;

        // Tell the sequencer (which updates state via stateManager)
        await this.sequencer.processTool(activeStep, sessionId, toolName);

        // Re-check if the step should change now that the tool was used
        await this.getActiveStep(config, messages, sessionId);
      }

      // Gets the current state (used for conditions, etc.)
      async getState(sessionId: SessionId): Promise<AIOrchestrationState | null> {
        return await this.stateManager.toAIOrchestrationState(sessionId);
      }

      // Updates arbitrary parts of the state
      async updateState(sessionId, partialState) {
        return await this.stateManager.updateState(sessionId, partialState);
      }
    }
Explanation:
- The OrchestrationManager uses helper classes: OrchestrationStateManager (to load/save the current status like activeStep, recentlyUsedTools, sequenceIndex) and StepSequencer (to handle the specific logic of enforcing sequence rules).
- getActiveStep finds the right step based on rules and history.
- getAllowedTools filters tools based on the active step's sequence or allow/deny lists.
- processToolUsage records that a tool was used and advances any active sequence.
- OrchestrationStateManager (agentdock-core/src/orchestration/state.ts): Handles saving and loading the orchestration state for each session. It uses the Session Management (SessionManager) and Storage (StorageProvider, StorageFactory) components discussed later to persist this state.

    // Simplified concept from agentdock-core/src/orchestration/state.ts
    import { SessionManager } from '../session';
    // ...

    export class OrchestrationStateManager {
      private sessionManager: SessionManager<OrchestrationState>; // Uses SessionManager!

      constructor(/* ... options including storage ... */) {
        // ... initializes sessionManager with storage ...
      }

      // Gets state (or creates if missing) using SessionManager
      async getOrCreateState(sessionId): Promise<OrchestrationState | null> {
        const result = await this.sessionManager.getSession(sessionId);
        if (result.success && result.data) return result.data;
        // ... handle creation if needed ...
        return newState;
      }

      // Updates state using SessionManager
      async updateState(sessionId, updates): Promise<OrchestrationState | null> {
        const updateFn = (currentState) => ({ ...currentState, ...updates });
        const result = await this.sessionManager.updateSession(sessionId, updateFn);
        return result.data;
      }

      // Adds tool to history (within updateState usually)
      async addUsedTool(sessionId, toolName) { /* ... */ }

      // Converts internal state to AI-facing state
      async toAIOrchestrationState(sessionId): Promise<AIOrchestrationState | null> { /* ... */ }
    }
- StepSequencer (agentdock-core/src/orchestration/sequencer.ts): Focuses specifically on managing the sequence logic within a step.

    // Simplified concept from agentdock-core/src/orchestration/sequencer.ts
    import { OrchestrationStateManager } from './state';
    // ...

    export class StepSequencer {
      private stateManager: OrchestrationStateManager;

      // Filters tools: returns ONLY the next expected tool if in a sequence
      async filterToolsBySequence(step, sessionId, allToolIds): Promise<string[]> {
        const state = await this.stateManager.getState(sessionId);
        const currentIndex = state?.sequenceIndex ?? 0;
        const expectedTool = step.sequence?.[currentIndex];

        if (expectedTool && allToolIds.includes(expectedTool)) {
          return [expectedTool]; // Only allow the expected tool
        }
        return allToolIds; // Sequence done or tool unavailable? Allow all/none.
      }

      // Processes tool use: advances sequence index if the tool matches
      async processTool(step, sessionId, usedTool): Promise<boolean> {
        await this.stateManager.addUsedTool(sessionId, usedTool); // Always track history

        const state = await this.stateManager.getState(sessionId);
        const currentIndex = state?.sequenceIndex ?? 0;
        const expectedTool = step.sequence?.[currentIndex];

        if (step.sequence && expectedTool === usedTool) {
          // Advance the index in the state
          await this.stateManager.updateState(sessionId, { sequenceIndex: currentIndex + 1 });
          return true;
        }
        return false; // Tool didn't match sequence
      }
    }
By separating concerns (Manager for overall coordination, StateManager for persistence, Sequencer for sequence logic), the system remains organized and easier to manage.
You've learned about the OrchestrationManager, AgentDock's "conductor" for controlling agent behavior!
- It allows you to define structured workflows using steps, conditions, sequences, and availableTools within the agent's OrchestrationConfig.
- It acts as a gatekeeper, determining which tools the agent is allowed to use at any given moment based on the current state and rules.
- It helps create more predictable and reliable agents, especially for tasks requiring specific multi-step processes.
- It relies on OrchestrationStateManager to remember the state for each conversation and StepSequencer to handle mandatory tool orders.
Understanding orchestration unlocks the ability to build sophisticated agents that follow specific protocols or workflows.
Now that we've covered the core components of an agent (Config, Tools, Nodes, LLM, Orchestration), how does an external user actually interact with an agent? The next chapter explains how AgentDock exposes agents through a web API.
Next: Chapter 6: API Route (/api/chat/[agentId]/route.ts)
Generated by AI Codebase Knowledge Builder