/config, and composes with the others without regressing steady-state performance.
Design premise: opt-in by default. With CHATCLI_QUALITY_* unset, the pipeline runs with zero post-hooks, so Pipeline.Run degenerates to a direct agent.Execute call. You only pay for what you enable.
The seven patterns
#1 — ReAct
Reason → Act → Observe. The base loop every worker runs. Already present; now emits structured events and auto-attaches effort hints.
#2 — Plan-and-Solve / ReWOO
PlannerAgent emits structured JSON; PlanRunner executes steps in topological order with #E1.head=200 placeholders.
#3 — Reflexion
Detects errors, hallucinations, or low quality; distills a Lesson via LLM and persists it in memory.Fact for future RAG retrieval.
#4 — RAG + HyDE
Hypothesis-based keyword expansion (3a) + cosine vector search (3b — Voyage/OpenAI, pure-Go backend).
#5 — Self-Refine
RefinerAgent critiques the draft and rewrites. Multi-pass with convergence via EpsilonChars.
#6 — Chain-of-Verification
VerifierAgent generates independent verification questions, answers each, and rewrites on discrepancy.
#7 — Reasoning Backbone
Cross-provider abstraction: thinking_budget on Anthropic, reasoning_effort on OpenAI. Auto-attach for critical agents.
Configuration
CHATCLI_QUALITY_* env vars, /config quality, and the five slash commands: /thinking, /plan, /refine, /verify, /reflect.
How the patterns connect
Trigger matrix
| Pattern | Slash | Env var | Default | Auto trigger |
|---|---|---|---|---|
| #1 ReAct | — | — | always on | always |
| #2 Plan-First | /plan [task] | CHATCLI_QUALITY_PLAN_FIRST_MODE | auto | complexity ≥ 6 |
| #3 Reflexion | /reflect <lesson> | CHATCLI_QUALITY_REFLEXION_ENABLED | on | error, CoVe flagged, refine low |
| #4 HyDE | — (transparent) | CHATCLI_QUALITY_HYDE_ENABLED | off | every retrieval |
| #5 Refine | /refine on\|off | CHATCLI_QUALITY_REFINE_ENABLED | off | post-worker |
| #6 CoVe | /verify on\|off | CHATCLI_QUALITY_VERIFY_ENABLED | off | post-worker |
| #7 Reasoning | /thinking on\|off | CHATCLI_QUALITY_REASONING_MODE | auto | for AutoAgents |
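As a reading aid, the defaults in the matrix can be sketched as a small Go loader. The environment variable names and default values come from the table. Everything else is an assumption: the `QualityConfig` struct, the accepted spellings (`on`/`off`, `true`/`false`, `1`/`0`), and the idea that the complexity score arrives as an integer. Only the `auto` Plan-First mode is modelled, since the table does not list the other modes.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// QualityConfig mirrors the trigger matrix. Field names are illustrative.
type QualityConfig struct {
	PlanFirstMode string // default "auto"
	Reflexion     bool   // default on
	HyDE          bool   // default off
	Refine        bool   // default off
	Verify        bool   // default off
	ReasoningMode string // default "auto"
}

// parseSwitch maps a raw value to a bool. The accepted spellings are an
// assumption; unknown or empty values fall back to the default.
func parseSwitch(raw string, fallback bool) bool {
	switch strings.ToLower(strings.TrimSpace(raw)) {
	case "on", "true", "1":
		return true
	case "off", "false", "0":
		return false
	}
	return fallback
}

// LoadQualityConfig reads the CHATCLI_QUALITY_* variables through lookup
// (os.LookupEnv in production, a fake in tests) and applies table defaults.
func LoadQualityConfig(lookup func(string) (string, bool)) QualityConfig {
	get := func(key string) string {
		v, _ := lookup(key)
		return v
	}
	mode := func(key, fallback string) string {
		if v := strings.ToLower(strings.TrimSpace(get(key))); v != "" {
			return v
		}
		return fallback
	}
	return QualityConfig{
		PlanFirstMode: mode("CHATCLI_QUALITY_PLAN_FIRST_MODE", "auto"),
		Reflexion:     parseSwitch(get("CHATCLI_QUALITY_REFLEXION_ENABLED"), true),
		HyDE:          parseSwitch(get("CHATCLI_QUALITY_HYDE_ENABLED"), false),
		Refine:        parseSwitch(get("CHATCLI_QUALITY_REFINE_ENABLED"), false),
		Verify:        parseSwitch(get("CHATCLI_QUALITY_VERIFY_ENABLED"), false),
		ReasoningMode: mode("CHATCLI_QUALITY_REASONING_MODE", "auto"),
	}
}

// ShouldPlanFirst applies the auto trigger from the table (complexity >= 6).
// Other modes are not specified in the table, so this sketch returns false
// for them rather than guessing their semantics.
func ShouldPlanFirst(cfg QualityConfig, complexity int) bool {
	return cfg.PlanFirstMode == "auto" && complexity >= 6
}

func main() {
	cfg := LoadQualityConfig(os.LookupEnv)
	fmt.Printf("%+v\n", cfg)
	fmt.Println("plan first at complexity 7:", ShouldPlanFirst(cfg, 7))
}
```

Passing the lookup function in, rather than calling `os.LookupEnv` directly, keeps the loader testable without touching the process environment.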
Override priority
For a given turn, the effort hint resolves in this order (later wins):
For Refine / Verify / Reflexion hook enablement:
For Plan-First:
Cost and latency
Defaults are calibrated so that steady-state behavior matches pre-pipeline chatcli. The expensive patterns (Refine, Verify, HyDE) start off; you opt in when the context justifies the cost.
| Pattern | Extra LLM calls per turn | Notes |
|---|---|---|
| ReAct | 0 (already part of the loop) | — |
| Plan-First (auto) | +1 (planner) when triggered | Steps reuse the dispatcher |
| Reflexion | +1 (lesson gen), background | Never blocks the turn |
| HyDE 3a | +1 (hypothesis), cheap | 200 token budget |
| HyDE 3b | +1 (query embed) + lazy backfill | embedding API ~$0.00002/1k tokens |
| Self-Refine | +N (one per pass, default 1) | Convergence cuts it short |
| CoVe | +1 (verifier) per call site | Internally N=3 questions |
| Reasoning auto | 0 extra calls; +tokens on hosted thinking | Anthropic budget = 8k default |
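The table can be turned into a rough per-turn estimate. The Go sketch below adds up the extra calls for a given turn, using the per-pattern counts from the table. The `TurnProfile` and `Estimate` types are hypothetical. Only Reflexion is documented as running in the background, so counting every other LLM call as blocking is an assumption, and the Self-Refine figure is an upper bound because convergence can end the passes early.

```go
package main

import "fmt"

// TurnProfile describes which patterns fire on one turn. Illustrative only.
type TurnProfile struct {
	PlanTriggered      bool // Plan-First auto trigger fired
	ReflexionTriggered bool // a lesson is being generated
	HyDEKeyword        bool // HyDE 3a: hypothesis generation
	HyDEVector         bool // HyDE 3b: query embedding
	RefinePasses       int  // configured Self-Refine passes (0 = disabled)
	VerifyCallSites    int  // CoVe call sites hit this turn (0 = disabled)
}

// Estimate splits extra calls by how they affect the turn.
type Estimate struct {
	Blocking   int // extra LLM calls assumed to be on the critical path
	Background int // extra LLM calls that never block the turn
	Embedding  int // embedding API calls
}

// EstimateExtraCalls applies the per-pattern counts from the cost table.
// ReAct and Reasoning add no calls, so they do not appear here.
func EstimateExtraCalls(p TurnProfile) Estimate {
	var e Estimate
	if p.PlanTriggered {
		e.Blocking++ // +1 planner
	}
	if p.ReflexionTriggered {
		e.Background++ // +1 lesson generation, background
	}
	if p.HyDEKeyword {
		e.Blocking++ // +1 hypothesis
	}
	if p.HyDEVector {
		e.Embedding++ // +1 query embed
	}
	if p.RefinePasses > 0 {
		e.Blocking += p.RefinePasses // upper bound: one call per pass
	}
	if p.VerifyCallSites > 0 {
		e.Blocking += p.VerifyCallSites // +1 verifier per call site
	}
	return e
}

func main() {
	turn := TurnProfile{
		PlanTriggered:      true,
		ReflexionTriggered: true,
		RefinePasses:       1,
		VerifyCallSites:    1,
	}
	fmt.Printf("%+v\n", EstimateExtraCalls(turn))
}
```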
Observability
Every active pattern shows up in /config quality:
Next steps
Tutorial: Plan-and-Solve
Start with the pattern with highest leverage on multi-step tasks.
Configure HyDE with vectors
Enable embeddings (Voyage or OpenAI) for semantic retrieval.
Slash reference
/thinking, /plan, /refine, /verify, /reflect.
Full env var list
All CHATCLI_QUALITY_* and CHATCLI_EMBED_*.