#2 Plan-and-Solve / ReWOO

Plan-and-Solve synthesizes a structured plan before the orchestrator starts dispatching. ReWOO (Reasoning Without Observation) extends it: the plan uses #E1, #E2 placeholders resolved at runtime with the outputs of previous steps. Combined, they avoid the expensive pattern of “dispatch an agent, wait, think again, dispatch another”.

When it fires: /plan <task> forces it; CHATCLI_QUALITY_PLAN_FIRST_MODE=always fires on every turn; auto (the default) fires when ComplexityScore(task) >= 6. To review the plan before executing, use /plan preview <task> (dry-run).

Plan format

The PlannerAgent, when it receives the PlannerStructuredOutputDirective marker at the start of the task, emits strict JSON:

{
  "task_summary": "Add OAuth login with Google",
  "steps": [
    {"id": "E1", "agent": "search", "task": "Find existing auth code in cli/auth"},
    {"id": "E2", "agent": "planner", "task": "Design integration plan based on #E1", "deps": ["E1"]},
    {"id": "E3", "agent": "coder",   "task": "Implement based on #E2", "deps": ["E2"]},
    {"id": "E4", "agent": "tester",  "task": "Write tests for #E3", "deps": ["E3"]}
  ],
  "parallel_groups": [["E1"], ["E2"], ["E3"], ["E4"]]
}
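As a minimal sketch, the JSON above can be decoded into Go structs like the following. The type names `Step` and `Plan` and the helper `decodePlan` are assumptions made for illustration; only the JSON field names come from the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Step mirrors one entry of "steps". The Go names are hypothetical;
// the JSON tags match the documented format.
type Step struct {
	ID    string   `json:"id"`
	Agent string   `json:"agent"`
	Task  string   `json:"task"`
	Deps  []string `json:"deps,omitempty"`
}

// Plan mirrors the top-level object emitted by the planner.
type Plan struct {
	TaskSummary    string     `json:"task_summary"`
	Steps          []Step     `json:"steps"`
	ParallelGroups [][]string `json:"parallel_groups"`
}

// decodePlan unmarshals planner output that is already clean JSON.
func decodePlan(raw string) (Plan, error) {
	var p Plan
	err := json.Unmarshal([]byte(raw), &p)
	return p, err
}

func main() {
	raw := `{
	  "task_summary": "Add OAuth login with Google",
	  "steps": [
	    {"id": "E1", "agent": "search", "task": "Find existing auth code in cli/auth"},
	    {"id": "E2", "agent": "planner", "task": "Design integration plan based on #E1", "deps": ["E1"]}
	  ],
	  "parallel_groups": [["E1"], ["E2"]]
	}`
	p, err := decodePlan(raw)
	if err != nil {
		panic(err)
	}
	for _, s := range p.Steps {
		fmt.Printf("%s agent=%s deps=%v\n", s.ID, s.Agent, s.Deps)
	}
}
```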

ReWOO placeholders

Any step can inject the output of earlier steps via #E<n>:

| Syntax | Effect |
| --- | --- |
| `#E1` | Replaced with the full output, trimmed |
| `#E1.summary` | First non-empty line of the output |
| `#E1.head=200` | First 200 runes, with a trailing `…` if truncated |
| `#E1.last=200` | Last 200 runes, with a leading `…` |

Use .head=N when an earlier step produces a large output (e.g. “given this log #E1.head=500, diagnose”) to bound the context the next step sees.
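The placeholder rules can be sketched as a single regular-expression pass. This is an illustrative reading of the table, not the project's resolver: `resolvePlaceholders` and `placeholderRe` are hypothetical names, and leaving unknown IDs untouched and adding the `…` on `.last` only when text was actually cut are assumptions.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// placeholderRe matches #E1, #E1.summary, #E1.head=200 and #E1.last=200.
var placeholderRe = regexp.MustCompile(`#(E\d+)(?:\.(summary)|\.(head|last)=(\d+))?`)

// resolvePlaceholders substitutes stored step outputs into a task string.
func resolvePlaceholders(task string, outputs map[string]string) string {
	return placeholderRe.ReplaceAllStringFunc(task, func(m string) string {
		g := placeholderRe.FindStringSubmatch(m)
		out, ok := outputs[g[1]]
		if !ok {
			return m // assumption: unknown IDs are left as written
		}
		out = strings.TrimSpace(out)
		switch {
		case g[2] == "summary":
			// First non-empty line of the output.
			for _, line := range strings.Split(out, "\n") {
				if s := strings.TrimSpace(line); s != "" {
					return s
				}
			}
			return ""
		case g[3] != "":
			n, _ := strconv.Atoi(g[4])
			r := []rune(out) // count runes, not bytes
			if len(r) <= n {
				return out
			}
			if g[3] == "head" {
				return string(r[:n]) + "…"
			}
			return "…" + string(r[len(r)-n:])
		}
		return out
	})
}

func main() {
	outputs := map[string]string{"E1": "\nfirst line\nsecond line\n"}
	fmt.Println(resolvePlaceholders("given this log #E1.head=5, diagnose", outputs))
	fmt.Println(resolvePlaceholders("summarize: #E1.summary", outputs))
}
```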

Complexity heuristic

ComplexityScore(task) → int[0,10] is calibrated to fire Plan-First only when it pays off. The score blends three signals:

| Signal | Cap | What counts |
| --- | --- | --- |
| Action verbs | 5 | `implement, add, create, fix, refactor, write, test, deploy, build, run, update, …`; bilingual (en + pt-BR: `implementar, adicionar, criar, corrigir, escrever, …`) |
| File artefacts | 3 | Matches on `\b[\w./-]+\.(go\|ts\|py\|md\|yaml\|…)\b` + `Dockerfile\|Makefile\|…` |
| Sequencers | 2 | `then, after, finally, e depois, e em seguida, por fim, após, …` |
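A rough sketch of how the three capped signals could be blended is shown below. It uses only the verbs, extensions, and sequencers spelled out in the table (the elided entries are not guessed), counts every occurrence, and removes file names before counting verbs. Those choices and the names `complexityScore` and `capAt` are assumptions, so the result may differ from what the real ComplexityScore returns for a given task.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Only the entries listed in the table; the real lists are longer.
	verbRe = regexp.MustCompile(`(?i)\b(?:implement|add|create|fix|refactor|write|test|deploy|build|run|update|implementar|adicionar|criar|corrigir|escrever)\b`)
	fileRe = regexp.MustCompile(`\b[\w./-]+\.(?:go|ts|py|md|yaml)\b|\b(?:Dockerfile|Makefile)\b`)
	seqRe  = regexp.MustCompile(`(?i)\b(?:then|after|finally|e depois|e em seguida|por fim|após)\b`)
)

// capAt limits a signal's contribution to its documented cap.
func capAt(n, limit int) int {
	if n > limit {
		return limit
	}
	return n
}

// complexityScore blends the three capped signals into a 0..10 score.
func complexityScore(task string) int {
	files := fileRe.FindAllString(task, -1)
	// Assumption: strip file names first so a path such as build/main.go
	// does not also count as an action verb.
	rest := fileRe.ReplaceAllString(task, " ")
	verbs := verbRe.FindAllString(rest, -1)
	seqs := seqRe.FindAllString(rest, -1)
	return capAt(len(verbs), 5) + capAt(len(files), 3) + capAt(len(seqs), 2)
}

func main() {
	for _, task := range []string{
		"read main.go",
		"update auth.go and add tests/auth_test.go then run go test",
	} {
		fmt.Printf("%d  %s\n", complexityScore(task), task)
	}
}
```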

Examples

Trivial (score 1)

read main.go

auto does not fire Plan-First. The orchestrator handles the task directly.

Multi-action (score 6+)

update auth.go and add tests/auth_test.go then run go test

auto fires Plan-First. 3 verbs (update, add, run) + 2 files + 1 sequencer = score 6.

PT-BR (score 6+)

criar handler em api.go e adicionar testes em api_test.go depois rodar go test

Scores the same as the previous example: the heuristic is bilingual by design.

Execution flow

User fires /agent or /coder

The task goes into AgentMode.Run().

runPlanFirstIfApplicable

Checks cli.pendingPlanFirst (one-shot) or quality.ShouldPlanFirst(cfg, userQuery).

Planner dispatch

agentDispatcher.Dispatch([{Agent: planner, Task: PlannerStructuredOutputDirective + userQuery}]).

ParsePlan

Tolerant to markdown code fences and trailing prose. Validates: unique IDs, deps point to earlier steps, #E<n> placeholders point to declared IDs.
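The parsing and validation rules in this step might look roughly like the sketch below. It is not the project's ParsePlan: `parsePlan`, `Plan`, and `Step` are hypothetical names, and skipping to the first `{` and then decoding a single JSON value is just one simple way to tolerate code fences and trailing prose.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"regexp"
	"strings"
)

type Step struct {
	ID   string   `json:"id"`
	Task string   `json:"task"`
	Deps []string `json:"deps"`
}

type Plan struct {
	Steps []Step `json:"steps"`
}

var refRe = regexp.MustCompile(`#(E\d+)`)

// parsePlan extracts the first JSON object from raw and validates it.
func parsePlan(raw string) (Plan, error) {
	var p Plan
	start := strings.Index(raw, "{")
	if start < 0 {
		return p, errors.New("no JSON object found in planner output")
	}
	// Decode reads exactly one JSON value, so a closing fence or
	// trailing prose after the object is ignored.
	if err := json.NewDecoder(strings.NewReader(raw[start:])).Decode(&p); err != nil {
		return p, fmt.Errorf("decode plan: %w", err)
	}
	seen := make(map[string]bool, len(p.Steps))
	for _, s := range p.Steps {
		if seen[s.ID] {
			return p, fmt.Errorf("duplicate step id %q", s.ID)
		}
		for _, d := range s.Deps {
			if !seen[d] {
				return p, fmt.Errorf("step %s: dep %q is not an earlier step", s.ID, d)
			}
		}
		seen[s.ID] = true
	}
	for _, s := range p.Steps {
		for _, m := range refRe.FindAllStringSubmatch(s.Task, -1) {
			if !seen[m[1]] {
				return p, fmt.Errorf("step %s: placeholder #%s is not declared", s.ID, m[1])
			}
		}
	}
	return p, nil
}

func main() {
	fence := strings.Repeat("`", 3)
	raw := fence + "json\n" +
		`{"steps":[{"id":"E1","task":"search"},{"id":"E2","task":"use #E1","deps":["E1"]}]}` +
		"\n" + fence + "\nLet me know if you want changes."
	p, err := parsePlan(raw)
	fmt.Println(len(p.Steps), err)
}
```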

TopologicalOrder + Execute

Stable topological order (lex sort on ties) guarantees reproducibility. Each step: resolve placeholders → dispatcher.Dispatch → store output.
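One common way to get a stable order is Kahn's algorithm with the ready set sorted before every pick. The sketch below assumes that approach; `topologicalOrder` here is a hypothetical stand-in, not the project's function. Note that a plain lexicographic sort puts "E10" before "E2".

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// topologicalOrder takes a map of step ID to dependency IDs and returns
// a deterministic execution order, breaking ties lexicographically.
func topologicalOrder(deps map[string][]string) ([]string, error) {
	indeg := make(map[string]int, len(deps))
	children := make(map[string][]string)
	for id, ds := range deps {
		indeg[id] = len(ds)
		for _, d := range ds {
			if _, ok := deps[d]; !ok {
				return nil, fmt.Errorf("step %s depends on unknown step %s", id, d)
			}
			children[d] = append(children[d], id)
		}
	}
	var ready []string
	for id, n := range indeg {
		if n == 0 {
			ready = append(ready, id)
		}
	}
	order := make([]string, 0, len(deps))
	for len(ready) > 0 {
		// Sorting before each pick makes the result independent of
		// Go's randomized map iteration order.
		sort.Strings(ready)
		id := ready[0]
		ready = ready[1:]
		order = append(order, id)
		for _, c := range children[id] {
			indeg[c]--
			if indeg[c] == 0 {
				ready = append(ready, c)
			}
		}
	}
	if len(order) != len(deps) {
		return nil, errors.New("dependency cycle detected")
	}
	return order, nil
}

func main() {
	order, err := topologicalOrder(map[string][]string{
		"E1": nil,
		"E2": {"E1"},
		"E3": {"E1"},
		"E4": {"E2", "E3"},
	})
	fmt.Println(order, err)
}
```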

Inject report

Two synthetic messages go into cli.history:

assistant: the plan JSON (the model sees what was attempted)
user: deterministic FinalReport with task/agent/status/output per step + handoff

The second message is role=user (not system) since the April 2026 fix. Models like Claude Sonnet 4.6 refuse completion when the conversation ends on assistant (“This model does not support assistant message prefill”). The synthetic user turn closes the conversation correctly and gives the orchestrator an explicit anchor to finalize.

ReAct loop continues (or is skipped, if dry-run)

With the report in history, the orchestrator finalizes without re-executing completed steps. In dry-run mode (/plan preview), the loop is skipped via planDryRunHandled — no orchestrator call is made.

Fault tolerance

A failing step does not abort the run. The behavior is “continue downstream with substituted error”:

// cli/agent/quality/plan_runner.go:126
if res.Error != nil {
    hadErrors = true
    outputs[id] = fmt.Sprintf("<error: %s>", res.Error.Error())
    // continues — downstream steps will see the "<error: …>" string substituted in their placeholders
}

This mirrors how the orchestrator already reacts to per-agent errors today (continues, summarizes, lets the model decide). The HadErrors flag is set so Reflexion can decide to escalate.
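To make the "continue downstream" behavior concrete, here is a self-contained sketch of a runner loop built around the same substitution. Everything except the `<error: …>` format and the hadErrors idea is assumed for illustration: `step`, `runPlan`, `fakeDispatch`, and `errTimeout` are hypothetical, and placeholder handling is reduced to bare `#E<n>` references.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

type step struct {
	ID, Agent, Task string
}

var refRe = regexp.MustCompile(`#E\d+`)

// errTimeout and fakeDispatch stand in for a real dispatcher.
var errTimeout = errors.New("timeout")

func fakeDispatch(agent, task string) (string, error) {
	if agent == "search" {
		return "", errTimeout
	}
	return "done: " + task, nil
}

// runPlan executes steps in the given order. A failed step stores an
// "<error: …>" string as its output, so later steps still run and see
// the failure in their resolved task text.
func runPlan(steps []step, dispatch func(agent, task string) (string, error)) (map[string]string, bool) {
	outputs := make(map[string]string, len(steps))
	hadErrors := false
	for _, s := range steps {
		task := refRe.ReplaceAllStringFunc(s.Task, func(m string) string {
			if out, ok := outputs[m[1:]]; ok { // m[1:] drops the leading '#'
				return out
			}
			return m
		})
		out, err := dispatch(s.Agent, task)
		if err != nil {
			hadErrors = true
			out = fmt.Sprintf("<error: %s>", err)
		}
		outputs[s.ID] = out
	}
	return outputs, hadErrors
}

func main() {
	outputs, hadErrors := runPlan([]step{
		{ID: "E1", Agent: "search", Task: "Find existing auth code"},
		{ID: "E2", Agent: "planner", Task: "Design based on #E1"},
	}, fakeDispatch)
	fmt.Println(outputs["E2"], hadErrors)
}
```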

`/plan` — manual invocation

/plan accepts six forms. Autocomplete (Tab) offers the subcommands.

/plan
# → "plan-first armed — your next /agent or /coder turn will run a structured plan"
/agent refactor auth package to use OAuth
# → runs Plan-First even if complexity score < threshold

# Also works with /coder — the flag is consumed on the first subsequent invocation

Behavior matrix

| Form | Enters | Executes steps? | Calls orchestrator? | When to use |
| --- | --- | --- | --- | --- |
| `/plan` | none (arms the flag) | depends on next `/agent` or `/coder` | yes | Want manual control of the consumer mode |
| `/plan <task>` | agent | yes | yes | Default flow, fastest |
| `/plan agent <task>` | agent | yes | yes | Explicit equivalent (symmetry with `coder`) |
| `/plan coder <task>` | coder | yes | yes | Engineering tasks (edits, builds, tests) |
| `/plan preview <task>` | agent (temporary) | no | no | Review plan before running |
| `/plan dry <task>` | same as `preview` | no | no | Alias |

The cli.pendingPlanFirst flag is consumed and cleared on the first subsequent /agent or /coder invocation. The cli.pendingPlanDryRun flag (exclusive to preview/dry modes) is cleared at the same point and makes AgentMode.Run return before the ReAct loop via planDryRunHandled.
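The one-shot semantics amount to a read-and-clear on both flags. In the sketch below, the field names follow the text, while the `cliState` type and the `consumePlanFlags` method are hypothetical.

```go
package main

import "fmt"

// cliState holds the two one-shot flags described above.
type cliState struct {
	pendingPlanFirst  bool
	pendingPlanDryRun bool
}

// consumePlanFlags returns the current flag values and clears both,
// so only the first /agent or /coder invocation after /plan sees them.
func (c *cliState) consumePlanFlags() (planFirst, dryRun bool) {
	planFirst, dryRun = c.pendingPlanFirst, c.pendingPlanDryRun
	c.pendingPlanFirst, c.pendingPlanDryRun = false, false
	return planFirst, dryRun
}

func main() {
	c := &cliState{pendingPlanFirst: true} // state after a bare /plan
	fmt.Println(c.consumePlanFlags())      // first invocation consumes the flag
	fmt.Println(c.consumePlanFlags())      // later invocations see it cleared
}
```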

Recommended flow for large changes: /plan preview <task> first → review the JSON → if approved, run /plan coder <same task>.

Environment variables

| Env var | Default | Values | Effect |
| --- | --- | --- | --- |
| `CHATCLI_QUALITY_PLAN_FIRST_MODE` | `auto` | `off\|auto\|always` | Trigger mode |
| `CHATCLI_QUALITY_PLAN_FIRST_THRESHOLD` | `6` | 0..10 | Minimum score for `auto` to fire |
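How the two variables could combine into a trigger decision is sketched below. The variable names, defaults, and value ranges come from the table; `planFirstConfig`, `configFromEnv`, and `shouldPlanFirst` are hypothetical, and falling back to the default on an invalid value is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

type planFirstConfig struct {
	Mode      string // "off", "auto" or "always"
	Threshold int    // minimum score for auto to fire
}

// configFromEnv reads both variables through getenv, which makes the
// function easy to exercise without touching the real environment.
func configFromEnv(getenv func(string) string) planFirstConfig {
	cfg := planFirstConfig{Mode: "auto", Threshold: 6}
	switch m := strings.ToLower(strings.TrimSpace(getenv("CHATCLI_QUALITY_PLAN_FIRST_MODE"))); m {
	case "off", "auto", "always":
		cfg.Mode = m
	}
	raw := strings.TrimSpace(getenv("CHATCLI_QUALITY_PLAN_FIRST_THRESHOLD"))
	if n, err := strconv.Atoi(raw); err == nil && n >= 0 && n <= 10 {
		cfg.Threshold = n
	}
	return cfg
}

// shouldPlanFirst applies the trigger mode to an already computed score.
func shouldPlanFirst(cfg planFirstConfig, score int) bool {
	switch cfg.Mode {
	case "always":
		return true
	case "auto":
		return score >= cfg.Threshold
	default: // "off"
		return false
	}
}

func main() {
	cfg := configFromEnv(os.Getenv)
	fmt.Printf("mode=%s threshold=%d fires at score 7: %v\n",
		cfg.Mode, cfg.Threshold, shouldPlanFirst(cfg, 7))
}
```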

Override via persona (CHATCLI_AGENT_PLANNER_*)

PlannerAgent respects the usual per-agent overrides:

# Force planner to use Opus with max thinking
export CHATCLI_AGENT_PLANNER_MODEL="claude-opus-4-8"
export CHATCLI_AGENT_PLANNER_EFFORT="max"

Observability

Each run emits structured logs:

{"level":"info","msg":"Plan-First triggered","forced":false,"mode":"auto","complexity":7}
{"level":"info","msg":"plan step dispatching","id":"E1","agent":"search","task":"Find existing..."}
{"level":"info","msg":"plan step dispatching","id":"E2","agent":"planner","task":"Design based on found auth code"}
{"level":"warn","msg":"plan step failed","id":"E3","agent":"coder","error":"timeout"}

And prints a friendly one-liner to the terminal:

  plan-first executed 4 step(s) before handing off to the orchestrator

Spinner during Plan-First

The “planning structured steps…” spinner shows only during the pure PlannerAgent call (step 1). During PlanRunner execution (step 2), the spinner is intentionally off and a static status line is emitted:

  executing structured plan...

This is critical for safety: steps can trigger interactive approvals (ShellAgent exec, CoderAgent write) via the policy engine. If the spinner stayed active during execution, its \r\033[K repaint would overwrite the approval prompt and block the response. Ctrl+C still cancels normally — the cancelled context propagates through dispatcher and PlanRunner between steps.

Dry-run: what gets rendered

/plan preview <task> renders:

  Generated plan (dry-run — nothing will be executed)

  Summary: <task_summary from JSON>

  1. [E1] agent=search
     <resolved task>
  2. [E2] agent=planner
     <task>
     deps: E1
  3. [E3] agent=coder
     <task>
     deps: E2

  Parallel groups:
     [E1]
     [E2]
     [E3]

  To execute this plan: /plan <same task> (or /plan coder <task>)

If ParsePlan fails (planner returned malformed JSON or prose), the raw output is printed with a warning — never silent.

When to turn off

Plan-First adds one LLM call (the planner) per fired turn. In budget-tight environments, setting the mode to off or raising the threshold above the default of 6 saves money.

# off: never fires, even with /plan
export CHATCLI_QUALITY_PLAN_FIRST_MODE=off

# High threshold: only fires on very complex tasks
export CHATCLI_QUALITY_PLAN_FIRST_THRESHOLD=9

See also

#3 Reflexion

Reflexion consumes HadErrors from the plan runner to generate lessons when steps fail.

Multi-Agent Orchestration

The dispatcher that PlanRunner reuses is the same as the standard orchestrator.

PlannerAgent (pre-PR)

The agent existed before the pipeline — this pattern formalizes how to invoke it deterministically.

Full configuration

All env vars and slashes in one place.