Plan-and-Solve synthesizes a structured plan before the orchestrator starts dispatching. ReWOO (Reasoning Without Observation) extends it: the plan uses #E1, #E2 placeholders resolved at runtime with the outputs of previous steps. Combined, they avoid the expensive pattern of “dispatch an agent, wait, think again, dispatch another”.
When it fires: /plan <task> forces it; CHATCLI_QUALITY_PLAN_FIRST_MODE=always fires on every turn; auto (the default) fires when ComplexityScore(task) >= 6. To review the plan before executing, use /plan preview <task> (dry-run).

Plan format

The PlannerAgent, when it receives the PlannerStructuredOutputDirective marker at the start of the task, emits strict JSON:
{
  "task_summary": "Add OAuth login with Google",
  "steps": [
    {"id": "E1", "agent": "search", "task": "Find existing auth code in cli/auth"},
    {"id": "E2", "agent": "planner", "task": "Design integration plan based on #E1", "deps": ["E1"]},
    {"id": "E3", "agent": "coder",   "task": "Implement based on #E2", "deps": ["E2"]},
    {"id": "E4", "agent": "tester",  "task": "Write tests for #E3", "deps": ["E3"]}
  ],
  "parallel_groups": [["E1"], ["E2"], ["E3"], ["E4"]]
}
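The JSON above maps naturally onto a pair of Go structs. The following is a minimal sketch: the type names Step and Plan and the helper decodePlan are hypothetical, and only the JSON keys are taken from the format shown above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Step and Plan are hypothetical type names; only the JSON keys
// come from the plan format documented above.
type Step struct {
	ID    string   `json:"id"`
	Agent string   `json:"agent"`
	Task  string   `json:"task"`
	Deps  []string `json:"deps,omitempty"`
}

type Plan struct {
	TaskSummary    string     `json:"task_summary"`
	Steps          []Step     `json:"steps"`
	ParallelGroups [][]string `json:"parallel_groups"`
}

// decodePlan unmarshals strict JSON planner output into a Plan.
func decodePlan(raw string) (Plan, error) {
	var p Plan
	if err := json.Unmarshal([]byte(raw), &p); err != nil {
		return Plan{}, fmt.Errorf("decode plan: %w", err)
	}
	return p, nil
}

func main() {
	raw := `{
	  "task_summary": "Add OAuth login with Google",
	  "steps": [
	    {"id": "E1", "agent": "search", "task": "Find existing auth code in cli/auth"},
	    {"id": "E2", "agent": "planner", "task": "Design integration plan based on #E1", "deps": ["E1"]}
	  ],
	  "parallel_groups": [["E1"], ["E2"]]
	}`
	p, err := decodePlan(raw)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, s := range p.Steps {
		fmt.Println(s.ID, s.Agent, s.Deps)
	}
}
```

A step without a deps key decodes to a nil slice, which is convenient for treating "no dependencies" and "empty list" the same way.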

ReWOO placeholders

Any step can inject the output of earlier steps via #E<n>:
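| Syntax | Effect |
| --- | --- |
| #E1 | Replaced with the full output, trimmed |
| #E1.summary | First non-empty line of the output |
| #E1.head=200 | First 200 runes, with a truncation marker appended if truncated |
| #E1.last=200 | Last 200 runes, with a leading truncation marker if truncated |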
Use .head=N when an earlier step produces large output (e.g. “given this log #E1.head=500, diagnose”) to bound the context the next step sees.
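The substitution rules above can be sketched with a single regular expression. This is an illustration under assumptions, not the project's resolver: the function name resolve, the choice to leave unknown IDs untouched, and the "…" truncation marker are all hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// placeholderRe matches #E1, #E1.summary, #E1.head=200 and #E1.last=200.
var placeholderRe = regexp.MustCompile(`#(E\d+)(?:\.(summary)|\.(head|last)=(\d+))?`)

// resolve substitutes placeholders in task using outputs keyed by step ID.
// Assumptions: unknown IDs are left as-is, and "…" marks truncation.
func resolve(task string, outputs map[string]string) string {
	return placeholderRe.ReplaceAllStringFunc(task, func(m string) string {
		g := placeholderRe.FindStringSubmatch(m)
		out, ok := outputs[g[1]]
		if !ok {
			return m
		}
		out = strings.TrimSpace(out)
		switch {
		case g[2] == "summary":
			// First non-empty line of the output.
			for _, line := range strings.Split(out, "\n") {
				if s := strings.TrimSpace(line); s != "" {
					return s
				}
			}
			return ""
		case g[3] != "":
			n, err := strconv.Atoi(g[4])
			if err != nil {
				return m
			}
			r := []rune(out) // count runes, not bytes
			if len(r) <= n {
				return out
			}
			if g[3] == "head" {
				return string(r[:n]) + "…"
			}
			return "…" + string(r[len(r)-n:])
		}
		return out
	})
}

func main() {
	outputs := map[string]string{
		"E1": "panic: nil map\ngoroutine 1 [running]:\nmain.main()",
	}
	fmt.Println(resolve("Given this log #E1.head=10, diagnose", outputs))
	fmt.Println(resolve("Summarize: #E1.summary", outputs))
}
```

Slicing on runes rather than bytes keeps multi-byte characters intact when the output is cut.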

Complexity heuristic

ComplexityScore(task) → int[0,10] is calibrated to fire Plan-First only when it pays off. The score blends three signals:
Each signal has a cap on how many points it can contribute:
  • Action verbs (cap 5): implement, add, create, fix, refactor, write, test, deploy, build, run, update, … Bilingual (en + pt-BR: implementar, adicionar, criar, corrigir, escrever, …)
  • File artefacts (cap 3): matches on \b[\w./-]+\.(go|ts|py|md|yaml|…)\b plus Dockerfile|Makefile|…
  • Sequencers (cap 2): then, after, finally, e depois, e em seguida, por fim, após, …

Examples

read main.go
auto does not fire Plan-First. Orchestrator handles directly.
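To make the three capped signals concrete, here is a rough sketch of a scorer. It assumes one point per distinct verb or sequencer and one point per file match, which the text above does not specify. The word lists are limited to the English items listed above, and the name scoreTask is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// Abbreviated lists: only English items named in the text above.
	verbs      = []string{"implement", "add", "create", "fix", "refactor", "write", "test", "deploy", "build", "run", "update"}
	sequencers = []string{"then", "after", "finally"}
	// Only the extensions visible in the documented pattern are included.
	fileRe = regexp.MustCompile(`\b[\w./-]+\.(go|ts|py|md|yaml)\b|\b(Dockerfile|Makefile)\b`)
	wordRe = regexp.MustCompile(`[\p{L}\p{N}_]+`)
)

// capAt limits n to at most limit.
func capAt(n, limit int) int {
	if n > limit {
		return limit
	}
	return n
}

// scoreTask returns a value in [0,10]: verbs (cap 5) + files (cap 3) + sequencers (cap 2).
// Assumption: each distinct hit is worth one point.
func scoreTask(task string) int {
	words := map[string]bool{}
	for _, w := range wordRe.FindAllString(strings.ToLower(task), -1) {
		words[w] = true
	}
	count := func(list []string) int {
		n := 0
		for _, w := range list {
			if words[w] {
				n++
			}
		}
		return n
	}
	files := len(fileRe.FindAllString(task, -1))
	return capAt(count(verbs), 5) + capAt(files, 3) + capAt(count(sequencers), 2)
}

func main() {
	for _, task := range []string{
		"read main.go",
		"implement login.go, add tests in login_test.go, then refactor auth.go and finally update README.md",
	} {
		fmt.Println(scoreTask(task), task)
	}
}
```

Under these assumptions "read main.go" scores 1 (a single file match), well below the default threshold of 6, which matches the example above.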

Execution flow

1. User fires /agent or /coder

The task goes into AgentMode.Run().

2. runPlanFirstIfApplicable

Checks cli.pendingPlanFirst (one-shot) or quality.ShouldPlanFirst(cfg, userQuery).

3. Planner dispatch

agentDispatcher.Dispatch([{Agent: planner, Task: PlannerStructuredOutputDirective + userQuery}]).

4. ParsePlan

Tolerant to markdown code fences and trailing prose. Validates: unique IDs, deps point to earlier steps, #E<n> placeholders point to declared IDs.

5. TopologicalOrder + Execute

Stable topological order (lex sort on ties) guarantees reproducibility. Each step: resolve placeholders → dispatcher.Dispatch → store output.

6. Inject report

Two synthetic messages go into cli.history:
  • assistant: the plan JSON (the model sees what was attempted)
  • user: deterministic FinalReport with task/agent/status/output per step + handoff
The second message has role=user (not system) since the April 2026 fix. Models like Claude Sonnet 4.6 reject a request whose conversation ends on an assistant turn (“This model does not support assistant message prefill”). The synthetic user turn closes the conversation correctly and gives the orchestrator an explicit anchor to finalize.

7. ReAct loop continues (or is skipped, if dry-run)

With the report in history, the orchestrator finalizes without re-executing completed steps. In dry-run mode (/plan preview), the loop is skipped via planDryRunHandled — no orchestrator call is made.
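The validation and ordering described in steps 4 and 5 can be sketched with Kahn's algorithm plus a lexical tie-break. This is an illustrative sketch, not the project's TopologicalOrder: the Step type and the function name topoOrder are hypothetical, and placeholder validation is omitted for brevity.

```go
package main

import (
	"fmt"
	"sort"
)

// Step is a hypothetical, trimmed-down plan step.
type Step struct {
	ID   string
	Deps []string
}

// topoOrder validates IDs and deps, then returns the step IDs in dependency
// order. Ties are broken lexicographically so the same plan always yields
// the same order.
func topoOrder(steps []Step) ([]string, error) {
	indeg := map[string]int{}
	children := map[string][]string{}
	for _, s := range steps {
		if _, dup := indeg[s.ID]; dup {
			return nil, fmt.Errorf("duplicate step id %q", s.ID)
		}
		indeg[s.ID] = 0
		for _, d := range s.Deps {
			// Only steps already seen count as "earlier".
			if _, ok := indeg[d]; !ok || d == s.ID {
				return nil, fmt.Errorf("step %q depends on %q, which is not an earlier step", s.ID, d)
			}
			indeg[s.ID]++
			children[d] = append(children[d], s.ID)
		}
	}
	var ready, order []string
	for id, n := range indeg {
		if n == 0 {
			ready = append(ready, id)
		}
	}
	for len(ready) > 0 {
		sort.Strings(ready) // lexical tie-break
		id := ready[0]
		ready = ready[1:]
		order = append(order, id)
		for _, c := range children[id] {
			indeg[c]--
			if indeg[c] == 0 {
				ready = append(ready, c)
			}
		}
	}
	return order, nil
}

func main() {
	order, err := topoOrder([]Step{
		{ID: "E1"},
		{ID: "E3", Deps: []string{"E1"}},
		{ID: "E2", Deps: []string{"E1"}},
		{ID: "E4", Deps: []string{"E2", "E3"}},
	})
	fmt.Println(order, err)
}
```

Because deps may only point to earlier steps, a cycle cannot pass validation, so no separate cycle check is needed. Note that a plain lexical sort places "E10" before "E2"; whether the real implementation compares IDs numerically is not stated in the text.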

Fault tolerance

A failing step does not abort the run. The behavior is “continue downstream with substituted error”:
// cli/agent/quality/plan_runner.go:126
if res.Error != nil {
    hadErrors = true
    outputs[id] = fmt.Sprintf("<error: %s>", res.Error.Error())
    // continues — downstream steps will see the "<error: …>" string substituted in their placeholders
}
This mirrors how the orchestrator already reacts to per-agent errors today (continues, summarizes, lets the model decide). The HadErrors flag is set so Reflexion can decide to escalate.
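A self-contained sketch of this "continue with substituted error" loop follows. The dispatchFunc signature, the Step type, and the bare #E<n> substitution are simplifying assumptions made for illustration; the real PlanRunner goes through the dispatcher and supports the full placeholder syntax.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// Step is a hypothetical, trimmed-down plan step.
type Step struct{ ID, Agent, Task string }

// dispatchFunc stands in for the real dispatcher; its signature is an assumption.
type dispatchFunc func(agent, task string) (string, error)

var (
	refRe      = regexp.MustCompile(`#E\d+`)
	errTimeout = errors.New("timeout") // sample failure used in main
)

// runPlan executes steps in the given order. A failed step does not abort
// the run: its output becomes "<error: ...>" and flows into later steps.
func runPlan(steps []Step, dispatch dispatchFunc) (map[string]string, bool) {
	outputs := map[string]string{}
	hadErrors := false
	for _, s := range steps {
		// Substitute bare #E<n> placeholders with earlier outputs.
		task := refRe.ReplaceAllStringFunc(s.Task, func(m string) string {
			if out, ok := outputs[m[1:]]; ok {
				return out
			}
			return m
		})
		out, err := dispatch(s.Agent, task)
		if err != nil {
			hadErrors = true
			out = fmt.Sprintf("<error: %s>", err.Error())
		}
		outputs[s.ID] = out
	}
	return outputs, hadErrors
}

func main() {
	steps := []Step{
		{"E1", "search", "Find auth code"},
		{"E2", "coder", "Implement based on #E1"},
	}
	fake := func(agent, task string) (string, error) {
		if agent == "search" {
			return "", errTimeout
		}
		return "received: " + task, nil
	}
	outputs, hadErrors := runPlan(steps, fake)
	fmt.Println(outputs["E2"], hadErrors)
	// prints: received: Implement based on <error: timeout> true
}
```

The downstream agent sees the error text in place of the missing output, so it can decide whether the step is still worth attempting.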

/plan — manual invocation

/plan accepts six forms. Autocomplete (Tab) offers the subcommands.
/plan
# → "plan-first armed — your next /agent or /coder turn will run a structured plan"
/agent refactor auth package to use OAuth
# → runs Plan-First even if complexity score < threshold

# Also works with /coder — the flag is consumed on the first subsequent invocation

Behavior matrix

| Form | Enters | Executes steps? | Calls orchestrator? | When to use |
| --- | --- | --- | --- | --- |
| /plan | none (arms flag) | depends on next /agent or /coder | yes | Want manual control of the consumer mode |
| /plan <task> | agent | yes | yes | Default flow, fastest |
| /plan agent <task> | agent | yes | yes | Explicit equivalent (symmetry with coder) |
| /plan coder <task> | coder | yes | yes | Engineering tasks (edits, builds, tests) |
| /plan preview <task> | agent (temporary) | no | no | Review plan before running |
| /plan dry <task> | same as preview | no | no | Alias |
The cli.pendingPlanFirst flag is consumed and cleared on the first subsequent /agent or /coder invocation. The cli.pendingPlanDryRun flag (exclusive to preview/dry modes) is cleared at the same point and makes AgentMode.Run return before the ReAct loop via planDryRunHandled.
Recommended flow for large changes: /plan preview <task> first → review the JSON → if approved, run /plan coder <same task>.
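The one-shot behavior of the two flags can be sketched as a read-and-clear helper. The session type and the consumePlanFlags method are hypothetical; only the field names mirror cli.pendingPlanFirst and cli.pendingPlanDryRun from the text.

```go
package main

import "fmt"

// session is a hypothetical stand-in for the CLI state.
type session struct {
	pendingPlanFirst  bool
	pendingPlanDryRun bool
}

// consumePlanFlags reads both one-shot flags and clears them, so only the
// first /agent or /coder turn after /plan is affected.
func (s *session) consumePlanFlags() (planFirst, dryRun bool) {
	planFirst, dryRun = s.pendingPlanFirst, s.pendingPlanDryRun
	s.pendingPlanFirst, s.pendingPlanDryRun = false, false
	return planFirst, dryRun
}

func main() {
	s := &session{pendingPlanFirst: true} // state after a bare /plan
	fmt.Println(s.consumePlanFlags())     // first turn sees the armed flag
	fmt.Println(s.consumePlanFlags())     // later turns see nothing
}
```

Clearing both flags in the same place keeps a stale dry-run flag from leaking into a later, unrelated turn.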

Environment variables

| Env var | Default | Values | Effect |
| --- | --- | --- | --- |
| CHATCLI_QUALITY_PLAN_FIRST_MODE | auto | off, auto, always | Trigger mode |
| CHATCLI_QUALITY_PLAN_FIRST_THRESHOLD | 6 | 0..10 | Minimum score for auto to fire |
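How the two variables could combine into a trigger decision is sketched below. The names planFirstConfig, loadConfig and shouldPlanFirst are hypothetical (the real entry point is quality.ShouldPlanFirst, whose signature is not shown here), and falling back to the defaults on invalid values is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// planFirstConfig is a hypothetical config holder.
type planFirstConfig struct {
	Mode      string
	Threshold int
}

// loadConfig reads both variables through getenv. Assumption: unset or
// invalid values fall back to the documented defaults (auto, 6).
func loadConfig(getenv func(string) string) planFirstConfig {
	cfg := planFirstConfig{Mode: "auto", Threshold: 6}
	switch m := getenv("CHATCLI_QUALITY_PLAN_FIRST_MODE"); m {
	case "off", "auto", "always":
		cfg.Mode = m
	}
	n, err := strconv.Atoi(getenv("CHATCLI_QUALITY_PLAN_FIRST_THRESHOLD"))
	if err == nil && n >= 0 && n <= 10 {
		cfg.Threshold = n
	}
	return cfg
}

// shouldPlanFirst applies the documented rules: off never fires (even when
// forced via /plan), always and forced always fire, auto compares the score.
func shouldPlanFirst(cfg planFirstConfig, forced bool, score int) bool {
	if cfg.Mode == "off" {
		return false
	}
	if forced || cfg.Mode == "always" {
		return true
	}
	return score >= cfg.Threshold
}

func main() {
	cfg := loadConfig(os.Getenv)
	fmt.Println(cfg.Mode, cfg.Threshold, shouldPlanFirst(cfg, false, 7))
}
```

Passing getenv as a parameter keeps the decision logic testable without touching the process environment.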

Override via persona (CHATCLI_AGENT_PLANNER_*)

PlannerAgent respects the usual per-agent overrides:
# Force planner to use Opus with max thinking
export CHATCLI_AGENT_PLANNER_MODEL="claude-opus-4-8"
export CHATCLI_AGENT_PLANNER_EFFORT="max"

Observability

Each run emits structured logs:
{"level":"info","msg":"Plan-First triggered","forced":false,"mode":"auto","complexity":7}
{"level":"info","msg":"plan step dispatching","id":"E1","agent":"search","task":"Find existing..."}
{"level":"info","msg":"plan step dispatching","id":"E2","agent":"planner","task":"Design based on found auth code"}
{"level":"warn","msg":"plan step failed","id":"E3","agent":"coder","error":"timeout"}
And prints a friendly one-liner to the terminal:
  plan-first executed 4 step(s) before handing off to the orchestrator

Spinner during Plan-First

The “planning structured steps…” spinner shows only while the PlannerAgent call itself is in flight (phase 1). During PlanRunner execution (phase 2), the spinner is intentionally off and a static status line is emitted:
  executing structured plan...
This is critical for safety: steps can trigger interactive approvals (ShellAgent exec, CoderAgent write) via the policy engine. If the spinner stayed active during execution, its \r\033[K repaint would overwrite the approval prompt and block the response. Ctrl+C still cancels normally — the cancelled context propagates through dispatcher and PlanRunner between steps.

Dry-run: what gets rendered

/plan preview <task> renders:
  Generated plan (dry-run — nothing will be executed)

  Summary: <task_summary from JSON>

  1. [E1] agent=search
     <resolved task>
  2. [E2] agent=planner
     <task>
     deps: E1
  3. [E3] agent=coder
     <task>
     deps: E2

  Parallel groups:
     [E1]
     [E2]
     [E3]

  To execute this plan: /plan <same task> (or /plan coder <task>)
If ParsePlan fails (the planner returned malformed JSON or prose), the raw output is printed with a warning, so the failure is never silent.
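One way to get both the tolerance (code fences, trailing prose) and the never-silent fallback is to decode only the first JSON object and print the raw text on failure. This is a sketch under assumptions: the names parseTolerant and preview, the warning wording, and the "find the first brace" strategy are all hypothetical.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// plan is a hypothetical, trimmed-down plan type.
type plan struct {
	TaskSummary string `json:"task_summary"`
	Steps       []struct {
		ID    string `json:"id"`
		Agent string `json:"agent"`
		Task  string `json:"task"`
	} `json:"steps"`
}

// parseTolerant decodes the first JSON object in raw. Text before the first
// "{" (such as an opening code fence) is skipped, and the decoder stops after
// one value, so a closing fence or trailing prose is ignored.
func parseTolerant(raw string) (plan, error) {
	var p plan
	start := strings.Index(raw, "{")
	if start < 0 {
		return p, errors.New("no JSON object found")
	}
	if err := json.NewDecoder(strings.NewReader(raw[start:])).Decode(&p); err != nil {
		return p, fmt.Errorf("decode plan: %w", err)
	}
	if len(p.Steps) == 0 {
		return p, errors.New("plan has no steps")
	}
	return p, nil
}

// preview renders the plan, or the raw output with a warning on failure.
func preview(raw string) string {
	p, err := parseTolerant(raw)
	if err != nil {
		return fmt.Sprintf("warning: could not parse plan (%v); raw planner output:\n%s", err, raw)
	}
	var b strings.Builder
	fmt.Fprintf(&b, "Summary: %s\n", p.TaskSummary)
	for i, s := range p.Steps {
		fmt.Fprintf(&b, "%d. [%s] agent=%s\n   %s\n", i+1, s.ID, s.Agent, s.Task)
	}
	return b.String()
}

func main() {
	fence := strings.Repeat("`", 3)
	raw := fence + "json\n" +
		`{"task_summary":"Add OAuth","steps":[{"id":"E1","agent":"search","task":"Find auth code"}]}` +
		"\n" + fence + "\nLet me know if you want changes."
	fmt.Print(preview(raw))
	fmt.Println(preview("Sorry, I cannot plan this."))
}
```

The sketch would still fail if prose containing a brace precedes the JSON; a production parser would need a more careful scan.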

When to turn off

Plan-First adds one extra LLM call (the planner) per turn in which it fires. In budget-tight environments, setting the mode to off or raising the threshold (e.g. to 9) saves money.
# off: never fires, even with /plan
export CHATCLI_QUALITY_PLAN_FIRST_MODE=off

# High threshold: only fires on very complex tasks
export CHATCLI_QUALITY_PLAN_FIRST_THRESHOLD=9

See also

#3 Reflexion

Reflexion consumes HadErrors from the plan runner to generate lessons when steps fail.

Multi-Agent Orchestration

The dispatcher that PlanRunner reuses is the same as the standard orchestrator.

PlannerAgent (pre-PR)

The agent existed before the pipeline — this pattern formalizes how to invoke it deterministically.

Full configuration

All env vars and slashes in one place.