The Multi-Agent mode transforms /coder and /agent into an orchestration system where the LLM dispatches specialist agents in parallel to solve complex tasks faster and more efficiently.

Activation

The multi-agent mode is enabled by default. To disable it, set:
CHATCLI_AGENT_PARALLEL_MODE=false
When disabled, /coder and /agent work exactly as before, with no change in behavior.

Architecture

User Query
    |
    v
AgentMode (existing ReAct loop)
    |
    v  (LLM responds with <agent_call> or <tool_call> tags)
Dispatcher (fan-out via semaphore)
    |
    |-- FileAgent        |-- CoderAgent       |-- ShellAgent
    |-- GitAgent         |-- SearchAgent      |-- PlannerAgent
    |-- ReviewerAgent    |-- TesterAgent      |-- RefactorAgent
    |-- DiagnosticsAgent |-- FormatterAgent   |-- DepsAgent
    +-- CustomAgent(s)   (devops, security-auditor, etc.)
    |
    v
Results Aggregator -> Feedback to the orchestrator LLM
The orchestrator LLM receives an agent catalog in the system prompt and learns to route tasks using <agent_call> tags:
<agent_call agent="file" task="Read all .go files in pkg/coder/engine/" />
<agent_call agent="coder" task="Add Close method to Engine struct" />
<agent_call agent="devops" task="Configure CI/CD pipeline with GitHub Actions" />
Multiple agent_call tags in the same response result in parallel execution.

Two Execution Modes

The orchestrator has two execution mechanisms and chooses the most appropriate one based on context:

Mode       | Syntax                                 | When to Use
agent_call | <agent_call agent="..." task="..." />  | New work phases, parallel tasks, exploratory reading, multi-file refactoring
tool_call  | <tool_call name="@coder" args="..." /> | Quick fixes, error diagnosis, point patches, post-agent validation. IMPORTANT: multiple independent tool_calls must be emitted in a SINGLE response

Decision Guide

Situation                               | Mode
Read multiple files + search references | agent_call (file + search in parallel)
Fix a compilation error                 | tool_call (direct patch)
Write a new module + tests              | agent_call (coder + shell)
Verify an agent's result                | tool_call (quick read/exec)
Fix after agent failure                 | tool_call (precise diagnosis)
Resume after an applied fix             | agent_call (next phase)

Embedded Specialist Agents

FileAgent
Access: Read-only (read, tree, search)
Skills:
  • batch-read — Accelerator script: reads N files in parallel goroutines without calling the LLM
  • find-pattern — Searches for patterns in files
  • analyze-structure — Analyzes code structure
  • map-deps — Maps dependencies between modules
CoderAgent
Access: Read/Write (write, patch, read, tree)
Skills:
  • write-file — Creates new files
  • patch-file — Precise modification of existing code
  • create-module — Boilerplate generation
  • refactor — Safe renaming and refactoring
ShellAgent
Access: Execution (exec, test)
Skills:
  • run-tests — Accelerator script: runs go test ./... -json and parses the results
  • build-check — Accelerator script: runs go build ./... && go vet ./...
  • lint-fix — Automatic lint correction
GitAgent
Access: Git ops (git-status, git-diff, git-log, git-changed, git-branch, exec)
Skills:
  • smart-commit — Accelerator script: collects status + diff for an intelligent commit
  • review-changes — Accelerator script: analyzes changes with changed + diff + log
  • create-branch — Branch creation
PlannerAgent
Access: None (no tools — pure LLM reasoning)
Skills:
  • analyze-task — Complexity and risk analysis
  • create-plan — Execution plan creation
  • decompose — Complex task decomposition
ReviewerAgent
Access: Read-only (read, search, tree)
Skills:
  • review-file — Analyzes files for bugs, code smells, SOLID violations, and security issues
  • diff-review — Accelerator script: reviews staged changes via git-diff and git-changed
  • scan-lint — Accelerator script: runs go vet and staticcheck and categorizes the issues
TesterAgent
Access: Read/Write/Execute (read, write, patch, exec, test, search, tree)
Skills:
  • generate-tests — Comprehensive test generation for functions and packages (LLM-driven)
  • run-coverage — Accelerator script: runs go test -coverprofile and parses per-function coverage
  • find-untested — Accelerator script: finds exported functions without corresponding tests
  • generate-table-test — Idiomatic Go table-driven test generation
RefactorAgent
Access: Read/Write (read, write, patch, search, tree)
Skills:
  • rename-symbol — Accelerator script: renames a symbol across all .go files, ignoring strings and comments
  • extract-interface — Extracts an interface from a concrete type's methods
  • move-function — Moves a function between packages, adjusting imports
  • inline-variable — Replaces a variable with its value at all usage points
DiagnosticsAgent
Access: Read/Execute (read, search, tree, exec)
Skills:
  • analyze-error — Parses error messages and stack traces, mapping them to code locations
  • check-deps — Accelerator script: runs go mod tidy and go mod verify and checks dependency health
  • bisect-bug — Guides an investigation to find the commit that introduced a bug
  • profile-bottleneck — Runs benchmarks or pprof and analyzes performance hotspots
FormatterAgent
Access: Write/Execute (read, patch, exec, tree)
Skills:
  • format-code — Accelerator script: runs gofmt -w (or goimports -w) on Go files
  • fix-imports — Accelerator script: runs goimports to organize imports
  • normalize-style — Applies consistent naming and style conventions (LLM-driven)
DepsAgent
Access: Read/Execute (read, exec, search, tree)
Skills:
  • audit-deps — Accelerator script: runs go mod verify and govulncheck for auditing
  • update-deps — Accelerator script: lists outdated dependencies with available updates (dry-run)
  • why-dep — Accelerator script: explains why a dependency exists via go mod why and go mod graph
  • find-outdated — Finds all dependencies with newer versions available

Custom Agents as Workers

Agent personas defined in ~/.chatcli/agents/ are automatically loaded as workers in the orchestration system when starting /coder or /agent. The LLM can dispatch them via <agent_call> with the same ReAct loop, parallel reading, and error recovery as the embedded agents.

How It Works

1. Scanning — When multi-agent mode starts, the system scans ~/.chatcli/agents/
2. CustomAgent Creation — For each agent found, a CustomAgent implementing the WorkerAgent interface is created
3. Tool Mapping — The tools field in the YAML frontmatter defines which commands the agent can use
4. Skills Loading — Associated skills are loaded and included in the worker's system prompt
5. Catalog Registration — The agent appears in the orchestrator's catalog and can be dispatched

Tool Mapping

The tools field in the YAML frontmatter maps Claude Code-style tools to @coder subcommands:
Tool in YAML | @coder Command(s)                                                  | Description
Read         | read                                                               | Read file contents
Grep         | search                                                             | Search for patterns in files
Glob         | tree                                                               | List directories
Bash         | exec, test, git-status, git-diff, git-log, git-changed, git-branch | Execution and git operations
Write        | write                                                              | Create/overwrite files
Edit         | patch                                                              | Precise editing (search/replace)

Custom Agent Example

---
name: "security-auditor"
description: "Security specialist focused on OWASP Top 10"
tools: Read, Grep, Glob
skills:
  - owasp-rules
  - compliance
---
# Base Personality

You are an expert Security Auditor. Analyze code looking for
OWASP Top 10 vulnerabilities, injection, XSS, and bad practices.
Because it declares only Read, Grep, and Glob, this agent is read-only, and the LLM can dispatch it as follows:
<agent_call agent="security-auditor" task="Audit the authentication module for OWASP vulnerabilities" />

Protection Rules

The 12 embedded agent names (file, coder, shell, git, search, planner, reviewer, tester, refactor, diagnostics, formatter, deps) are protected and cannot be overwritten by custom agents.
  • No tools = read-only: Agents without a tools field automatically receive read, search, tree and are marked as read-only
  • Duplicates ignored: If two agents have the same name, only the first one is registered

Skills: Scripts vs Descriptive

Each agent has two types of skills:
Script skills — pre-defined command sequences that bypass the LLM for mechanical, repetitive operations, executing directly on the engine:
batch-read   -> Reads N files in parallel goroutines (no LLM call)
run-tests    -> go test ./... -json | automatic parsing
build-check  -> go build ./... && go vet ./...
smart-commit -> git status + git diff --cached -> summary
map-project  -> tree + search interfaces/structs in parallel
Descriptive skills — skills without an accelerator script (the ones marked LLM-driven above); their content is loaded and included in the worker's system prompt to guide its reasoning.

Skills V2 (Packages)

Skills V2 are directories containing:
  • SKILL.md — Main content with frontmatter
  • Subskills (.md) — Additional knowledge documents
  • scripts/ — Executable scripts automatically registered in the worker
skills/
+-- clean-code/
    |-- SKILL.md            # Main content
    |-- naming-rules.md     # Subskill: naming rules
    |-- formatting.md       # Subskill: formatting rules
    +-- scripts/
        +-- lint_check.py   # Executable script (registered as skill)
The worker can read subskills with the read command and execute scripts with exec during its autonomous operation.

Error Recovery Strategy

When an agent_call fails, the orchestrator follows an intelligent recovery protocol:
1. Diagnosis via tool_call — Uses a direct tool_call to read the relevant files and understand the error (the orchestrator already has the context)
2. Fix via tool_call — Patches, file corrections, and retries are faster and safer via tool_call
3. Resume via agent_call — Once the fix is applied and verified, resumes with agent_call for the next phase
Key rule: Error recovery = tool_call (fast, precise). New work phases = agent_call (parallel, scalable).
agent_call -> FAILURE
    |
    v
tool_call: read (diagnose the error)
    |
    v
tool_call: patch (apply fix)
    |
    v
tool_call: exec (verify fix)
    |
    v
agent_call -> NEXT PHASE (success)

Configuration

Variable                       | Default | Description
CHATCLI_AGENT_PARALLEL_MODE    | true    | Enables/disables multi-agent mode
CHATCLI_AGENT_MAX_WORKERS      | 4       | Maximum simultaneous goroutines
CHATCLI_AGENT_WORKER_MAX_TURNS | 10      | Maximum turns per worker
CHATCLI_AGENT_WORKER_TIMEOUT   | 5m      | Timeout per worker

.env Example

# Multi-Agent (Parallel Orchestration)
CHATCLI_AGENT_PARALLEL_MODE=true    # Disable with false if needed
CHATCLI_AGENT_MAX_WORKERS=4
CHATCLI_AGENT_WORKER_MAX_TURNS=10
CHATCLI_AGENT_WORKER_TIMEOUT=5m

Anti-Race Safety

The system implements multiple layers of protection against race conditions:
  • FileLockManager — Per-filepath mutex (normalized absolute paths). Write operations acquire locks; reads do not block.
  • Isolated History — Each worker maintains its own []models.Message, with no sharing.
  • Independent LLM Clients — Each worker creates its own LLM client instance via a factory.
  • Stateless Engine — Each worker instantiates its own fresh engine.Engine.
  • Context Tree — The parent context can cancel all workers via context.WithCancel.
  • Policy Enforcement — Workers fully respect coder_policy.json (allow/deny/ask). Actions with an "ask" policy pause the spinner and display a serialized security prompt to the user.

Security Governance in Parallel Mode

Parallel workers respect all rules from the coder_policy.json file (global and local). This means that actions such as write, patch, and exec go through the same policy verification as in sequential mode.

Behavior by Rule Type

Rule  | Behavior in Worker
allow | Action executed automatically, without interruption
deny  | Action silently blocked; the worker receives a [BLOCKED BY POLICY] error
ask   | Worker pauses, the spinner is suspended, and a security prompt is displayed to the user

Prompt Serialization

When multiple workers need approval simultaneously, prompts are serialized via mutex — only one prompt is displayed at a time. After the user responds, the next worker in the queue receives its prompt. This prevents:
  • Visual overlap of prompts in the terminal
  • stdin reading conflicts
  • Spinner rendering over the security prompt

Prompt with Agent Context

The security prompt in parallel mode displays contextual information about which agent is requesting the action:
+========================================================+
|                     SECURITY CHECK                     |
+========================================================+
 Agent:  coder
 Task:   Refactor authentication module
 --------------------------------------------------------
 Action: Write file
         file: pkg/auth/handler.go
 Rule:   no rule for '@coder write'
 --------------------------------------------------------
 Choice:
   [y] Yes, execute (once)
   [a] Always allow (@coder write)
   [n] No, skip
   [d] Always block (@coder write)
This allows the user to make informed decisions about each action, knowing exactly which agent is asking and why.

Runtime Provider/Model Respect

Parallel workers always use the active provider and model at the time of dispatch. If the user switches providers (e.g., from Anthropic to Google AI) via /switch, the next agent dispatches will use the new provider correctly.

Execution Flow (Example)

1. User sends the query — "refactor the coder module, separate read and write"
2. Orchestrator LLM dispatches parallel agents:
<agent_call agent="file" task="Read all .go files in pkg/coder/engine/" />
<agent_call agent="search" task="Find references to handleRead and handleWrite" />
3. Dispatcher creates goroutines — FileAgent and SearchAgent run in parallel, each with its own LLM client and isolated mini ReAct loop (within the maxWorkers limit).
4. Results aggregated — Feedback is sent to the orchestrator.
5. Orchestrator dispatches CoderAgent — For the refactoring (with FileLock on the files being written).
6. Dispatches ShellAgent for tests — Runs tests after writing.
7. Error recovery (if needed) — If tests fail, uses tool_call for diagnosis and a quick fix.
8. Final validation — The orchestrator validates the final result and reports to the user.

Parallelism Maximization

ChatCLI’s prompt system explicitly instructs the AI to maximize parallelism at all levels:
  1. tool_call: Independent operations (read 3 files, search + read) must be emitted in a SINGLE response, not in separate turns
  2. agent_call: For 3+ independent tasks, prefer agent_call which runs in parallel goroutines
  3. Per-turn anchor: Every ReAct loop turn, a reminder reinforces the need for parallelism
Correct example (3 reads in ONE response):
<tool_call name="@coder" args='{"cmd":"read","args":{"file":"main.go"}}' />
<tool_call name="@coder" args='{"cmd":"read","args":{"file":"config.go"}}' />
<tool_call name="@coder" args='{"cmd":"read","args":{"file":"handler.go"}}' />
Incorrect example (3 turns for independent operations):
Turn 1: read main.go → wait
Turn 2: read config.go → wait
Turn 3: read handler.go → wait

Compatibility

  • CHATCLI_AGENT_PARALLEL_MODE=false: everything works exactly as before
  • <tool_call> tags continue to work even with parallel mode active
  • No existing function signatures were changed
  • The cli/agent/workers/ package is completely isolated and does not impact existing functionality