Plan-and-Solve synthesizes a structured plan before the orchestrator starts dispatching. ReWOO (Reasoning Without Observation) extends it: the plan uses #E1, #E2 placeholders resolved at runtime with the outputs of previous steps. Combined, they avoid the expensive pattern of “dispatch an agent, wait, think again, dispatch another”.
When it fires: /plan <task> forces it; CHATCLI_QUALITY_PLAN_FIRST_MODE=always fires on every turn; auto (the default) fires when ComplexityScore(task) reaches the threshold (6 by default); off disables it.

Plan format

The PlannerAgent, when it receives the PlannerStructuredOutputDirective marker at the start of the task, emits strict JSON:
{
  "task_summary": "Add OAuth login with Google",
  "steps": [
    {"id": "E1", "agent": "search", "task": "Find existing auth code in cli/auth"},
    {"id": "E2", "agent": "planner", "task": "Design integration plan based on #E1", "deps": ["E1"]},
    {"id": "E3", "agent": "coder",   "task": "Implement based on #E2", "deps": ["E2"]},
    {"id": "E4", "agent": "tester",  "task": "Write tests for #E3", "deps": ["E3"]}
  ],
  "parallel_groups": [["E1"], ["E2"], ["E3"], ["E4"]]
}
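As a rough illustration, the plan JSON above could be decoded into Go structs like the following. Only the JSON keys come from the example; the type names (Plan, Step) and the decodePlan helper are hypothetical and not taken from the ChatCLI source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Step mirrors one entry of "steps". The Go names are assumptions;
// only the JSON keys come from the documented plan format.
type Step struct {
	ID    string   `json:"id"`
	Agent string   `json:"agent"`
	Task  string   `json:"task"`
	Deps  []string `json:"deps,omitempty"`
}

// Plan mirrors the top-level plan object.
type Plan struct {
	TaskSummary    string     `json:"task_summary"`
	Steps          []Step     `json:"steps"`
	ParallelGroups [][]string `json:"parallel_groups"`
}

// decodePlan is a hypothetical helper: it only unmarshals the JSON
// and performs none of the validation described for ParsePlan.
func decodePlan(raw string) (Plan, error) {
	var p Plan
	err := json.Unmarshal([]byte(raw), &p)
	return p, err
}

func main() {
	raw := `{
	  "task_summary": "Add OAuth login with Google",
	  "steps": [
	    {"id": "E1", "agent": "search", "task": "Find existing auth code in cli/auth"},
	    {"id": "E2", "agent": "planner", "task": "Design integration plan based on #E1", "deps": ["E1"]}
	  ],
	  "parallel_groups": [["E1"], ["E2"]]
	}`
	p, err := decodePlan(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for _, s := range p.Steps {
		fmt.Printf("%s (%s) deps=%v: %s\n", s.ID, s.Agent, s.Deps, s.Task)
	}
}
```

A step without a "deps" key simply decodes to a nil slice, which is why the first step in the example can omit it.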

ReWOO placeholders

Any step can inject the output of earlier steps via #E<n>:
Syntax          Effect
#E1             Replaced with the trimmed full output
#E1.summary     First non-empty line of the output
#E1.head=200    First 200 runes, with a truncation marker appended if truncated
#E1.last=200    Last 200 runes, with a leading truncation marker
Use .head=N when an earlier step may produce large output (e.g. “given this log #E1.head=500, diagnose”) to bound the context the next step sees.
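The placeholder forms above can be sketched with a single regular expression and a replacement callback. This is a minimal sketch, not the ChatCLI resolver: the resolve function, the regular expression, the choice of "…" as truncation marker, and the decision to leave unknown IDs untouched are all assumptions.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// placeholderRe matches #E1, #E1.summary, #E1.head=200 and #E1.last=200.
var placeholderRe = regexp.MustCompile(`#(E\d+)(?:\.(summary)|\.(head|last)=(\d+))?`)

// marker is an assumed truncation marker; the real one may differ.
const marker = "…"

// resolve substitutes placeholders in task with earlier step outputs.
// Placeholders whose ID has no stored output are left as written (assumption).
func resolve(task string, outputs map[string]string) string {
	return placeholderRe.ReplaceAllStringFunc(task, func(m string) string {
		g := placeholderRe.FindStringSubmatch(m)
		out, ok := outputs[g[1]]
		if !ok {
			return m
		}
		out = strings.TrimSpace(out)
		switch {
		case g[2] == "summary":
			// First non-empty line of the output.
			for _, line := range strings.Split(out, "\n") {
				if s := strings.TrimSpace(line); s != "" {
					return s
				}
			}
			return ""
		case g[3] != "":
			n, err := strconv.Atoi(g[4])
			if err != nil {
				return m
			}
			r := []rune(out) // count runes, not bytes
			if n >= len(r) {
				return out
			}
			if g[3] == "head" {
				return string(r[:n]) + marker
			}
			return marker + string(r[len(r)-n:])
		}
		return out
	})
}

func main() {
	outputs := map[string]string{"E1": "\nfound 3 files\ncli/auth/login.go\ncli/auth/token.go\n"}
	fmt.Println(resolve("Design a plan based on #E1.summary", outputs))
	fmt.Println(resolve("Given this listing #E1.head=12, pick a file", outputs))
}
```

Slicing on runes rather than bytes keeps multi-byte characters intact when the output is cut.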

Complexity heuristic

ComplexityScore(task) → int[0,10] is calibrated to fire Plan-First only when it pays off. The score blends three signals:
Signal           Cap   What counts
Action verbs     5     implement, add, create, fix, refactor, write, test, deploy, build, run, update, …; bilingual (en + pt-BR: implementar, adicionar, criar, corrigir, escrever, …)
File artefacts   3     Matches on \b[\w./-]+\.(go|ts|py|md|yaml|…)\b + Dockerfile|Makefile|…
Sequencers       2     then, after, finally, e depois, e em seguida, por fim, após, …
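To make the capped blend concrete, here is a minimal sketch of a scorer with the same three signals and caps. It is not the ChatCLI implementation: the word lists contain only the entries named in the table, the file pattern is cut down to the listed extensions, and awarding one point per distinct verb, file match, or sequencer is an assumption about how the score accumulates.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// Partial lists: only the entries named in the table above.
	actionVerbs = []string{"implement", "add", "create", "fix", "refactor", "write",
		"test", "deploy", "build", "run", "update",
		"implementar", "adicionar", "criar", "corrigir", "escrever"}
	sequencers = []string{"then", "after", "finally",
		"e depois", "e em seguida", "por fim", "após"}

	fileRe = regexp.MustCompile(`\b[\w./-]+\.(go|ts|py|md|yaml)\b|\b(Dockerfile|Makefile)\b`)
	wordRe = regexp.MustCompile(`\p{L}+`)
)

// capAt limits n to the given cap.
func capAt(n, limit int) int {
	if n > limit {
		return limit
	}
	return n
}

// countPhrases counts how many phrases occur as whole words in padded,
// which must be lower-case words joined by single spaces with a space on each end.
func countPhrases(padded string, phrases []string) int {
	n := 0
	for _, p := range phrases {
		if strings.Contains(padded, " "+p+" ") {
			n++
		}
	}
	return n
}

// complexityScore returns a value in [0, 10]: verbs (cap 5) + files (cap 3) + sequencers (cap 2).
func complexityScore(task string) int {
	files := len(fileRe.FindAllString(task, -1))
	// Remove file names first so "login_test.go" does not count as the verb "test".
	prose := strings.ToLower(fileRe.ReplaceAllString(task, " "))
	padded := " " + strings.Join(wordRe.FindAllString(prose, -1), " ") + " "
	return capAt(countPhrases(padded, actionVerbs), 5) +
		capAt(files, 3) +
		capAt(countPhrases(padded, sequencers), 2)
}

func main() {
	for _, task := range []string{
		"read main.go",
		"implement OAuth in cli/auth/login.go, then write tests in login_test.go and finally update README.md",
	} {
		fmt.Printf("%d  %s\n", complexityScore(task), task)
	}
}
```

Under these assumptions a short read-only request stays far below the default threshold of 6, while a multi-verb, multi-file, sequenced request clears it.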

Examples

read main.go
In auto mode this task does not fire Plan-First; the orchestrator handles it directly.

Execution flow

1. User fires /agent or /coder
   The task goes into AgentMode.Run().
2. runPlanFirstIfApplicable
   Checks cli.pendingPlanFirst (one-shot) or quality.ShouldPlanFirst(cfg, userQuery).
3. Planner dispatch
   agentDispatcher.Dispatch([{Agent: planner, Task: PlannerStructuredOutputDirective + userQuery}]).
4. ParsePlan
   Tolerant of markdown code fences and trailing prose. Validates: unique IDs, deps point to earlier steps, #E<n> placeholders point to declared IDs.
5. TopologicalOrder + Execute
   Stable topological order (lexicographic sort on ties) guarantees reproducibility. Each step: resolve placeholders → dispatcher.Dispatch → store output.
6. Inject report
   Two synthetic messages go into cli.history:
     • assistant: the plan JSON (the model sees what was attempted)
     • system: deterministic FinalReport with task/agent/status/output per step
7. ReAct loop continues
   With the report in history, the orchestrator finalizes without re-executing completed steps.
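The stable ordering in step 5 can be sketched with Kahn's algorithm, sorting the ready set so that ties always resolve the same way. This is an illustration under assumptions rather than the actual TopologicalOrder: the Step type and topoOrder function are hypothetical, and the validation here only checks that IDs are unique and deps are declared, which is looser than the "deps point to earlier steps" rule described for ParsePlan.

```go
package main

import (
	"fmt"
	"sort"
)

// Step is a hypothetical, trimmed-down plan step.
type Step struct {
	ID   string
	Deps []string
}

// topoOrder returns step IDs in dependency order. Among steps that are
// ready at the same time, the lexicographically smallest ID goes first,
// so the same plan always yields the same order.
func topoOrder(steps []Step) ([]string, error) {
	indeg := make(map[string]int, len(steps))
	children := make(map[string][]string)
	for _, s := range steps {
		if _, dup := indeg[s.ID]; dup {
			return nil, fmt.Errorf("duplicate step id %q", s.ID)
		}
		indeg[s.ID] = 0
	}
	for _, s := range steps {
		for _, d := range s.Deps {
			if _, ok := indeg[d]; !ok {
				return nil, fmt.Errorf("step %s depends on undeclared step %s", s.ID, d)
			}
			indeg[s.ID]++
			children[d] = append(children[d], s.ID)
		}
	}
	var ready []string
	for id, n := range indeg {
		if n == 0 {
			ready = append(ready, id)
		}
	}
	order := make([]string, 0, len(steps))
	for len(ready) > 0 {
		sort.Strings(ready) // lexicographic tie-break
		id := ready[0]
		ready = ready[1:]
		order = append(order, id)
		for _, c := range children[id] {
			indeg[c]--
			if indeg[c] == 0 {
				ready = append(ready, c)
			}
		}
	}
	if len(order) != len(steps) {
		return nil, fmt.Errorf("dependency cycle detected")
	}
	return order, nil
}

func main() {
	steps := []Step{
		{ID: "E3", Deps: []string{"E1"}},
		{ID: "E2", Deps: []string{"E1"}},
		{ID: "E1"},
	}
	order, err := topoOrder(steps)
	fmt.Println(order, err)
}
```

Note that a plain lexicographic tie-break places "E10" before "E2"; a production version might compare the numeric suffix instead.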

Fault tolerance

A failing step does not abort the run. The behavior is “continue downstream with substituted error”:
// cli/agent/quality/plan_runner.go:126
if res.Error != nil {
    hadErrors = true
    outputs[id] = fmt.Sprintf("<error: %s>", res.Error.Error())
    // continues — downstream steps will see the "<error: …>" string substituted in their placeholders
}
This mirrors how the orchestrator already reacts to per-agent errors today (continues, summarizes, lets the model decide). The HadErrors flag is set so Reflexion can decide to escalate.
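A self-contained sketch of the same "continue downstream with substituted error" behavior is shown below. Everything apart from the "<error: …>" format and the hadErrors flag is hypothetical: the step type, the dispatchFn signature, flakyDispatch, and the simplified placeholder handling (plain #E<n> only) are stand-ins for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

type step struct{ ID, Task string }

// dispatchFn stands in for the real dispatcher; its signature is an assumption.
type dispatchFn func(task string) (string, error)

var refRe = regexp.MustCompile(`#(E\d+)`)

// runPlan executes steps in the given order. A failing step records
// "<error: ...>" as its output and the run continues, so later steps
// see that string wherever they reference the failed step.
func runPlan(steps []step, dispatch dispatchFn) (map[string]string, bool) {
	outputs := make(map[string]string, len(steps))
	hadErrors := false
	for _, s := range steps {
		task := refRe.ReplaceAllStringFunc(s.Task, func(m string) string {
			if out, ok := outputs[m[1:]]; ok { // m[1:] drops the leading '#'
				return out
			}
			return m
		})
		out, err := dispatch(task)
		if err != nil {
			hadErrors = true
			out = fmt.Sprintf("<error: %s>", err.Error())
		}
		outputs[s.ID] = out
	}
	return outputs, hadErrors
}

// flakyDispatch is a fake dispatcher: it fails the task "fetch" and echoes the rest.
func flakyDispatch(task string) (string, error) {
	if task == "fetch" {
		return "", errors.New("timeout")
	}
	return "did: " + task, nil
}

func main() {
	steps := []step{{ID: "E1", Task: "fetch"}, {ID: "E2", Task: "summarize #E1"}}
	outputs, hadErrors := runPlan(steps, flakyDispatch)
	fmt.Println(outputs["E1"])
	fmt.Println(outputs["E2"])
	fmt.Println("hadErrors:", hadErrors)
}
```

Because the error text travels as ordinary output, a downstream agent can still produce something useful, such as explaining the failure, instead of the whole run stopping.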

/plan — manual invocation

/plan
# → "plan-first armed — your next /agent or /coder turn will run a structured plan"
/agent refactor auth package to use OAuth
# → runs Plan-First even if complexity score < threshold
The cli.pendingPlanFirst flag is consumed and cleared on the first subsequent /agent or /coder invocation.

Environment variables

Env var                                Default   Values            Effect
CHATCLI_QUALITY_PLAN_FIRST_MODE        auto      off|auto|always   Trigger mode
CHATCLI_QUALITY_PLAN_FIRST_THRESHOLD   6         0..10             Minimum score for auto to fire
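How the two variables could combine into a trigger decision is sketched below. The variable names, defaults, and value ranges come from the table; the planFirstConfig type, loadConfig, shouldPlanFirst, and the fallback to defaults on invalid values are assumptions, not the behavior of quality.ShouldPlanFirst.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// planFirstConfig is a hypothetical holder for the two settings.
type planFirstConfig struct {
	Mode      string // off | auto | always
	Threshold int    // 0..10
}

// loadConfig reads both variables through getenv and keeps the documented
// defaults (auto, 6) when a value is missing or out of range (assumption).
func loadConfig(getenv func(string) string) planFirstConfig {
	cfg := planFirstConfig{Mode: "auto", Threshold: 6}
	switch m := getenv("CHATCLI_QUALITY_PLAN_FIRST_MODE"); m {
	case "off", "auto", "always":
		cfg.Mode = m
	}
	n, err := strconv.Atoi(getenv("CHATCLI_QUALITY_PLAN_FIRST_THRESHOLD"))
	if err == nil && n >= 0 && n <= 10 {
		cfg.Threshold = n
	}
	return cfg
}

// shouldPlanFirst decides whether a turn runs Plan-First. forced models a
// pending /plan; score is the task's complexity score.
func shouldPlanFirst(cfg planFirstConfig, forced bool, score int) bool {
	switch cfg.Mode {
	case "off":
		return false // never fires, even with /plan
	case "always":
		return true
	}
	return forced || score >= cfg.Threshold
}

func main() {
	cfg := loadConfig(os.Getenv)
	fmt.Printf("mode=%s threshold=%d\n", cfg.Mode, cfg.Threshold)
	fmt.Println("score 7, not forced:", shouldPlanFirst(cfg, false, 7))
	fmt.Println("score 2, forced:    ", shouldPlanFirst(cfg, true, 2))
}
```

Passing getenv as a parameter keeps the sketch easy to exercise with a fake environment.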

Override via persona (CHATCLI_AGENT_PLANNER_*)

PlannerAgent respects the usual per-agent overrides:
# Force planner to use Opus with max thinking
export CHATCLI_AGENT_PLANNER_MODEL="claude-opus-4-7"
export CHATCLI_AGENT_PLANNER_EFFORT="max"

Observability

Each run emits structured logs:
{"level":"info","msg":"Plan-First triggered","forced":false,"mode":"auto","complexity":7}
{"level":"info","msg":"plan step dispatching","id":"E1","agent":"search","task":"Find existing..."}
{"level":"info","msg":"plan step dispatching","id":"E2","agent":"planner","task":"Design based on found auth code"}
{"level":"warn","msg":"plan step failed","id":"E3","agent":"coder","error":"timeout"}
It also prints a friendly one-liner to the terminal:
  plan-first executed 4 step(s) before handing off to the orchestrator

When to turn off

Plan-First adds one LLM call (the planner) per fired turn. In budget-tight environments, setting the mode to off or raising the threshold (for example to 8 or 9) saves money.
# off: never fires, even with /plan
export CHATCLI_QUALITY_PLAN_FIRST_MODE=off

# High threshold: only fires on very complex tasks
export CHATCLI_QUALITY_PLAN_FIRST_THRESHOLD=9

See also

#3 Reflexion

Reflexion consumes HadErrors from the plan runner to generate lessons when steps fail.

Multi-Agent Orchestration

The dispatcher that PlanRunner reuses is the same as the standard orchestrator.

PlannerAgent (pre-PR)

The agent existed before the pipeline — this pattern formalizes how to invoke it deterministically.

Full configuration

All env vars and slash commands in one place.