Configuration — Env, Slashes and /config

All quality-pipeline configuration in one place. There are three channels: environment variables (persistent), slash commands (per session), and /config quality (read-only inspection).

`/config quality`

Single command that lists the full pipeline state:

/config quality

Typical output:

✨ AGENT HARNESS/QUALITY PIPELINE ───────────────────────────
  CHATCLI_QUALITY_ENABLED              : enabled
  Hooks registered                    : pre=0, post=2

  ── Self-Refine (#5)
  CHATCLI_QUALITY_REFINE_ENABLED       : disabled
  CHATCLI_QUALITY_REFINE_MAX_PASSES    : 1
  CHATCLI_QUALITY_REFINE_MIN_BYTES     : 200
  CHATCLI_QUALITY_REFINE_EPSILON       : 50
  CHATCLI_QUALITY_REFINE_EXCLUDE       : formatter, deps, refiner, verifier
  CHATCLI_QUALITY_REFINE_CONVERGENCE_ENABLED   : enabled
  CHATCLI_QUALITY_REFINE_CONVERGENCE_EMBEDDING : disabled
  CHATCLI_QUALITY_REFINE_CONVERGENCE_STRICT    : disabled
  CHATCLI_QUALITY_REFINE_CONVERGENCE_CHAR_HIGH : 0.99
  CHATCLI_QUALITY_REFINE_CONVERGENCE_JACCARD_HIGH: 0.95
  CHATCLI_QUALITY_REFINE_CONVERGENCE_EMBEDDING_SIM: 0.92

  ── Chain-of-Verification / CoVe (#6)
  CHATCLI_QUALITY_VERIFY_ENABLED       : disabled
  CHATCLI_QUALITY_VERIFY_NUM_QUESTIONS : 3
  CHATCLI_QUALITY_VERIFY_REWRITE       : enabled
  CHATCLI_QUALITY_VERIFY_EXCLUDE       : formatter, deps, shell, refiner, verifier

  ── Reflexion (#3)
  CHATCLI_QUALITY_REFLEXION_ENABLED       : enabled
  CHATCLI_QUALITY_REFLEXION_ON_ERROR      : enabled
  CHATCLI_QUALITY_REFLEXION_ON_HALLUCINATION: enabled
  CHATCLI_QUALITY_REFLEXION_ON_LOW_QUALITY : disabled
  CHATCLI_QUALITY_REFLEXION_PERSIST       : enabled
  CHATCLI_QUALITY_REFLEXION_QUEUE_ENABLED : enabled
  CHATCLI_QUALITY_REFLEXION_QUEUE_WORKERS : 2
  CHATCLI_QUALITY_REFLEXION_QUEUE_CAPACITY: 1000
  CHATCLI_QUALITY_REFLEXION_QUEUE_MAX_ATTEMPTS: 5
  CHATCLI_QUALITY_REFLEXION_QUEUE_STALE_AFTER : 168h0m0s
  Runtime state                         : queue=0 dlq=0

  ── Plan-and-Solve / ReWOO (#2)
  CHATCLI_QUALITY_PLAN_FIRST_MODE      : auto
  CHATCLI_QUALITY_PLAN_FIRST_THRESHOLD : 6

  ── RAG + HyDE (#4)
  CHATCLI_QUALITY_HYDE_ENABLED         : disabled
  CHATCLI_QUALITY_HYDE_USE_VECTORS     : disabled
  CHATCLI_EMBED_PROVIDER               : (not set)
  CHATCLI_EMBED_MODEL                  : (not set)
  CHATCLI_QUALITY_HYDE_NUM_KEYWORDS    : 5
  Vector index                        : (not attached)

  ── Reasoning backbone (#7)
  CHATCLI_QUALITY_REASONING_MODE       : auto
  CHATCLI_QUALITY_REASONING_BUDGET     : 8000
  CHATCLI_QUALITY_REASONING_AUTO_AGENTS: planner, refiner, verifier, reflexion

Slash commands

`/thinking` — reasoning override

/thinking                    # show current state
/thinking auto               # clear override
/thinking off                # force no-thinking for next turn
/thinking on                 # alias for /thinking high
/thinking low|medium|high|max
/thinking budget=N           # nearest tier to N tokens

Details: #7 Reasoning Backbone.

`/plan` — force Plan-and-Solve

/plan                               # arm flag; next /agent or /coder uses Plan-First
/plan <free task>                   # arm + enter agent mode and execute
/plan agent <task>                  # explicit equivalent of the previous form
/plan coder <task>                  # enter coder mode (software engineer) and execute
/plan preview <task>                # dry-run: generate and render the plan WITHOUT executing
/plan dry <task>                    # alias of preview

Recommended flow for large changes: /plan preview <task> → review the plan → /plan coder <same task> to execute.

Details: #2 Plan-and-Solve.

`/refine` — Self-Refine session toggle

/refine                  # current state
/refine on               # enable for session
/refine off              # disable
/refine once|next        # enable (today identical to on)
/refine auto|clear       # clear override → use /config

Details: #5 Self-Refine.
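The toggle semantics above amount to a two-level lookup: a session override wins while it is set, and `auto` or `clear` removes it so the env var (or its default) applies again. The sketch below only illustrates that resolution order; `REFINE_OVERRIDE`, `refine_slash`, and `refine_effective` are hypothetical names, not ChatCLI internals.

```shell
# Hypothetical session state: empty means "no override".
REFINE_OVERRIDE=""

# Mimics `/refine on|off|once|next|auto|clear` by updating the session override.
refine_slash() {
  case "$1" in
    on|once|next) REFINE_OVERRIDE=true ;;   # once/next currently behave like on
    off)          REFINE_OVERRIDE=false ;;
    auto|clear)   REFINE_OVERRIDE="" ;;     # fall back to the env var
    *) echo "unknown argument: $1" >&2; return 1 ;;
  esac
}

# Effective value: session override if present, else env var, else the documented default.
refine_effective() {
  if [ -n "$REFINE_OVERRIDE" ]; then
    echo "$REFINE_OVERRIDE"
  else
    echo "${CHATCLI_QUALITY_REFINE_ENABLED:-false}"
  fi
}
```

With `CHATCLI_QUALITY_REFINE_ENABLED=true` in the environment, `refine_slash off` followed by `refine_effective` prints `false`, and `refine_slash auto` brings it back to `true`.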

`/verify` — CoVe session toggle

/verify                  # current state
/verify on               # enable
/verify off              # disable
/verify once             # enable once
/verify auto             # clear override

Details: #6 CoVe.

`/reflect` — durable lesson queue

/reflect <free text of the lesson>
# Ex: /reflect when editing large Go files use Edit, not full rewrite

/reflect <free text> writes directly to memory.Fact (category=lesson, trigger=manual) without an LLM call. The queue subcommands (list/failed/retry/purge/drain) show and operate on automatic triggers (error, hallucination, low quality) buffered in the WAL awaiting async processing.

/reflect retry and /reflect purge feature dynamic autocomplete: Tab pulls live IDs from the DLQ with task preview + last error.

Full details: #3 Reflexion → Durable Queue.

Env vars — full reference

Master switch

| Env var | Default | Values | Effect |
| --- | --- | --- | --- |
| `CHATCLI_QUALITY_ENABLED` | `true` | `true\|false` | Master switch for the whole pipeline. `false` makes `Pipeline.Run` call `agent.Execute` directly, with zero overhead |

Self-Refine (#5)

| Env var | Default | Values | Effect |
| --- | --- | --- | --- |
| `CHATCLI_QUALITY_REFINE_ENABLED` | `false` | `true\|false` | Enable RefineHook |
| `CHATCLI_QUALITY_REFINE_MAX_PASSES` | `1` | int ≥ 1 | Hard cap on passes |
| `CHATCLI_QUALITY_REFINE_MIN_BYTES` | `200` | int ≥ 0 | Skip outputs smaller than N bytes |
| `CHATCLI_QUALITY_REFINE_EPSILON` | `50` | int ≥ 0 | Char-level fallback threshold |
| `CHATCLI_QUALITY_REFINE_EXCLUDE` | `formatter,deps,refiner,verifier` | CSV | Agents that don’t pass through refine |
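Read together, the rows above describe a gate in front of the refine pass. The following is a minimal sketch of that gate under the documented defaults; `csv_contains` and `should_refine` are hypothetical helpers, and the real hook may apply further conditions not listed here.

```shell
# Succeeds (exit 0) when NAME appears in a comma-separated LIST.
# Usage: csv_contains LIST NAME
csv_contains() {
  case ",$1," in
    *",$2,"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Usage: should_refine AGENT OUTPUT_BYTES
should_refine() {
  enabled="${CHATCLI_QUALITY_REFINE_ENABLED:-false}"
  exclude="${CHATCLI_QUALITY_REFINE_EXCLUDE:-formatter,deps,refiner,verifier}"
  min_bytes="${CHATCLI_QUALITY_REFINE_MIN_BYTES:-200}"

  [ "$enabled" = "true" ] || return 1            # hook disabled
  if csv_contains "$exclude" "$1"; then
    return 1                                     # agent is on the exclude list
  fi
  [ "$2" -ge "$min_bytes" ]                      # skip outputs smaller than N bytes
}
```

For example, with refine enabled, `should_refine shell 500` succeeds while `should_refine formatter 500` and `should_refine shell 120` do not.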

Semantic convergence cascade

| Env var | Default | Values | Effect |
| --- | --- | --- | --- |
| `CHATCLI_QUALITY_REFINE_CONVERGENCE_ENABLED` | `true` | `true\|false` | Cascade master switch (char→Jaccard→embedding) |
| `CHATCLI_QUALITY_REFINE_CONVERGENCE_EMBEDDING` | `false` | `true\|false` | Include embedding scorer (opt-in, costs $) |
| `CHATCLI_QUALITY_REFINE_CONVERGENCE_STRICT` | `false` | `true\|false` | Strict: refuse convergence without embedding |
| `CHATCLI_QUALITY_REFINE_CONVERGENCE_CHAR_HIGH` | `0.99` | 0.0-1.0 | Char short-circuit CONVERGED |
| `CHATCLI_QUALITY_REFINE_CONVERGENCE_CHAR_LOW` | `0.3` | 0.0-1.0 | Char short-circuit DIVERGED |
| `CHATCLI_QUALITY_REFINE_CONVERGENCE_JACCARD_HIGH` | `0.95` | 0.0-1.0 | Jaccard short-circuit CONVERGED |
| `CHATCLI_QUALITY_REFINE_CONVERGENCE_EMBEDDING_SIM` | `0.92` | 0.0-1.0 | Final embedding cosine threshold |
| `CHATCLI_QUALITY_REFINE_CONVERGENCE_CACHE_SIZE` | `256` | int | LRU cache size |
| `CHATCLI_QUALITY_REFINE_CONVERGENCE_CACHE_TTL_MIN` | `5` | int | Cache TTL (minutes) |
| `CHATCLI_QUALITY_REFINE_CONVERGENCE_BREAKER_THRESHOLD` | `3` | int | Failures before breaker opens |
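One way to read the threshold rows is as a cheapest-first decision ladder. The sketch below takes precomputed similarity scores and applies the thresholds in the char, Jaccard, embedding order. It is an illustration only: `cascade_verdict` is a hypothetical helper, the `UNDECIDED` outcome and the treatment of a low embedding score as `DIVERGED` are assumptions, and strict mode, caching, and the circuit breaker are not modeled.

```shell
# Usage: cascade_verdict CHAR_SIM JACCARD_SIM [EMBED_SIM]
# Prints CONVERGED, DIVERGED, or UNDECIDED. Omit EMBED_SIM when no embedding score exists.
cascade_verdict() {
  awk -v c="$1" -v j="$2" -v e="${3:-}" \
      -v ch="${CHATCLI_QUALITY_REFINE_CONVERGENCE_CHAR_HIGH:-0.99}" \
      -v cl="${CHATCLI_QUALITY_REFINE_CONVERGENCE_CHAR_LOW:-0.3}" \
      -v jh="${CHATCLI_QUALITY_REFINE_CONVERGENCE_JACCARD_HIGH:-0.95}" \
      -v es="${CHATCLI_QUALITY_REFINE_CONVERGENCE_EMBEDDING_SIM:-0.92}" '
    BEGIN {
      if (c + 0 >= ch + 0) { print "CONVERGED"; exit }  # char short-circuit (high)
      if (c + 0 <= cl + 0) { print "DIVERGED"; exit }   # char short-circuit (low)
      if (j + 0 >= jh + 0) { print "CONVERGED"; exit }  # Jaccard short-circuit
      if (e == "")         { print "UNDECIDED"; exit }  # assumption: no embedding score
      if (e + 0 >= es + 0) print "CONVERGED"; else print "DIVERGED"
    }'
}
```

For instance, `cascade_verdict 0.8 0.97` settles at the Jaccard step, while `cascade_verdict 0.8 0.6 0.93` needs the embedding score to reach a verdict.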

CoVe (#6)

| Env var | Default | Values | Effect |
| --- | --- | --- | --- |
| `CHATCLI_QUALITY_VERIFY_ENABLED` | `false` | `true\|false` | Enable VerifyHook |
| `CHATCLI_QUALITY_VERIFY_NUM_QUESTIONS` | `3` | int 1-7 | Number of verification questions |
| `CHATCLI_QUALITY_VERIFY_REWRITE` | `true` | `true\|false` | Rewrite output on discrepancy |
| `CHATCLI_QUALITY_VERIFY_EXCLUDE` | `formatter,deps,shell,refiner,verifier` | CSV | Agents that don’t pass through verify |

Reflexion (#3)

| Env var | Default | Values | Effect |
| --- | --- | --- | --- |
| `CHATCLI_QUALITY_REFLEXION_ENABLED` | `true` | `true\|false` | Enable ReflexionHook |
| `CHATCLI_QUALITY_REFLEXION_ON_ERROR` | `true` | `true\|false` | Fire on worker error |
| `CHATCLI_QUALITY_REFLEXION_ON_HALLUCINATION` | `true` | `true\|false` | Fire on `verified_with_discrepancy` |
| `CHATCLI_QUALITY_REFLEXION_ON_LOW_QUALITY` | `false` | `true\|false` | Fire on `refine_low_quality` |
| `CHATCLI_QUALITY_REFLEXION_PERSIST` | `true` | `true\|false` | Write to memory.Fact |

Durable queue (WAL + worker pool + DLQ)

| Env var | Default | Values | Effect |
| --- | --- | --- | --- |
| `CHATCLI_QUALITY_REFLEXION_QUEUE_ENABLED` | `true` | `true\|false` | Enable the queue (default). `false` = legacy mode (goroutine) |
| `CHATCLI_QUALITY_REFLEXION_QUEUE_WORKERS` | `2` | int ≥ 1 | Workers processing in parallel |
| `CHATCLI_QUALITY_REFLEXION_QUEUE_CAPACITY` | `1000` | int | In-memory cap |
| `CHATCLI_QUALITY_REFLEXION_QUEUE_DROP_OLDEST` | `false` | `true\|false` | `true` = drop oldest; `false` = block |
| `CHATCLI_QUALITY_REFLEXION_QUEUE_BLOCK_TIMEOUT` | `5s` | duration | Enqueue timeout when queue is full |
| `CHATCLI_QUALITY_REFLEXION_QUEUE_MAX_ATTEMPTS` | `5` | int ≥ 1 | Total retries |
| `CHATCLI_QUALITY_REFLEXION_QUEUE_INITIAL_DELAY` | `1s` | duration | First retry delay |
| `CHATCLI_QUALITY_REFLEXION_QUEUE_MAX_DELAY` | `5m` | duration | Cap on exponential backoff |
| `CHATCLI_QUALITY_REFLEXION_QUEUE_JITTER` | `0.2` | 0.0-0.5 | Fractional jitter |
| `CHATCLI_QUALITY_REFLEXION_QUEUE_JOB_TIMEOUT` | `2m` | duration | Per-job timeout |
| `CHATCLI_QUALITY_REFLEXION_QUEUE_STALE_AFTER` | `168h` | duration | Old records discarded on replay |
| `CHATCLI_QUALITY_REFLEXION_QUEUE_BASE_DIR` | — | path | Queue directory override |
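To get a feel for how the initial delay, the cap, and the jitter interact, the sketch below prints a retry schedule in milliseconds. It assumes the delay doubles on each retry and that the fractional jitter is a symmetric window around the delay; the table only says "exponential" and "fractional", so both are assumptions, and `backoff_schedule` is a hypothetical helper.

```shell
# Usage: backoff_schedule INITIAL_MS MAX_MS COUNT JITTER_PCT
# Prints one line per retry: the capped delay and the jitter window around it.
backoff_schedule() {
  delay=$1; max=$2; count=$3; jitter_pct=$4
  n=1
  while [ "$n" -le "$count" ]; do
    if [ "$delay" -gt "$max" ]; then delay=$max; fi   # cap on exponential growth
    spread=$(( delay * jitter_pct / 100 ))            # e.g. 20 means +/-20%
    echo "retry $n: ${delay}ms +/-${spread}ms"
    delay=$(( delay * 2 ))                            # assumed multiplier of 2
    n=$(( n + 1 ))
  done
}

# Documented defaults: 1s initial delay, 5m cap, 5 attempts, 0.2 jitter.
backoff_schedule 1000 300000 5 20
# retry 1: 1000ms +/-200ms
# retry 2: 2000ms +/-400ms
# retry 3: 4000ms +/-800ms
# retry 4: 8000ms +/-1600ms
# retry 5: 16000ms +/-3200ms
```

With the defaults the cap never comes into play; it only matters once the initial delay or the attempt count is raised, for example `backoff_schedule 60000 300000 4 20` ends at 300000ms.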

Plan-and-Solve (#2)

| Env var | Default | Values | Effect |
| --- | --- | --- | --- |
| `CHATCLI_QUALITY_PLAN_FIRST_MODE` | `auto` | `off\|auto\|always` | When to fire |
| `CHATCLI_QUALITY_PLAN_FIRST_THRESHOLD` | `6` | int 0-10 | Minimum score for auto to fire |
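The two variables combine into a simple decision, sketched below. `should_plan_first` is a hypothetical helper that takes the task score as an argument; how ChatCLI computes that score is not covered on this page, and an explicit `/plan` is not modeled.

```shell
# Usage: should_plan_first SCORE
# Exit 0 when Plan-First would fire for a task with the given score (0-10).
should_plan_first() {
  mode="${CHATCLI_QUALITY_PLAN_FIRST_MODE:-auto}"
  threshold="${CHATCLI_QUALITY_PLAN_FIRST_THRESHOLD:-6}"
  case "$mode" in
    always) return 0 ;;
    off)    return 1 ;;
    auto)   [ "$1" -ge "$threshold" ] ;;   # threshold is the minimum score that fires
    *)      echo "unknown mode: $mode" >&2; return 2 ;;
  esac
}
```

Under the defaults a score of 6 fires and a score of 5 does not; lowering the threshold to 0 in `auto` mode behaves like `always`.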

RAG + HyDE (#4)

| Env var | Default | Values | Effect |
| --- | --- | --- | --- |
| `CHATCLI_QUALITY_HYDE_ENABLED` | `false` | `true\|false` | Enable phase 3a (keyword expansion) |
| `CHATCLI_QUALITY_HYDE_USE_VECTORS` | `false` | `true\|false` | Enable phase 3b (vector search) |
| `CHATCLI_QUALITY_HYDE_NUM_KEYWORDS` | `5` | int ≥ 1 | Hypothesis keyword cap |

Embedding providers (used by HyDE 3b)

| Env var | Default | Values | Effect |
| --- | --- | --- | --- |
| `CHATCLI_EMBED_PROVIDER` | `null` | `voyage` / `openai` / `bedrock` / `null` | Embedding backend picker |
| `CHATCLI_EMBED_MODEL` | provider default | string | Voyage: `voyage-3`. OpenAI: `text-embedding-3-small` / `-large`. Bedrock: `amazon.titan-embed-text-v2:0` (default), `amazon.titan-embed-text-v1`, `cohere.embed-english-v3`, `cohere.embed-multilingual-v3`. |
| `CHATCLI_EMBED_DIMENSIONS` | model native | int | OpenAI: truncate via Matryoshka (`text-embedding-3-small`=1536, `-large`=3072 native). Bedrock Titan v2: accepts 256 / 512 / 1024 (rejects others). Bedrock Titan v1 / Cohere v3: fixed dim, ignored. |
| `VOYAGE_API_KEY` | — | string | Required for `provider=voyage` |
| `OPENAI_API_KEY` | — | string | Required for `provider=openai` (the same key used for chat) |
| `BEDROCK_REGION` / `AWS_REGION` / `AWS_PROFILE` / AWS credentials | same as chat Bedrock | — | Required for `provider=bedrock`; reuses the same credential chain as the AWS Bedrock feature (IAM role, SSO, assume-role, profile). |

Bedrock embeddings support Titan (single text per call — parallelized internally with an 8-worker pool) and Cohere v3 (native batch). Family dispatch is automatic from the model id prefix. See RAG + HyDE for the full architecture.
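The prefix-based dispatch and the Titan v2 dimension rule can be pictured with two small checks. This is a sketch: the exact prefixes ChatCLI matches are assumed from the model ids listed in the table, and `embed_family` and `titan_v2_dim_ok` are hypothetical helpers.

```shell
# Usage: embed_family MODEL_ID
# Prints titan, cohere, or unknown based on the model id prefix.
embed_family() {
  case "$1" in
    amazon.titan-embed-*) echo titan ;;    # single text per call
    cohere.embed-*)       echo cohere ;;   # native batch
    *)                    echo unknown ;;
  esac
}

# Usage: titan_v2_dim_ok DIM
# Exit 0 only for the sizes Titan v2 is documented to accept.
titan_v2_dim_ok() {
  case "$1" in
    256|512|1024) return 0 ;;
    *)            return 1 ;;
  esac
}
```

A quick pre-flight check before exporting `CHATCLI_EMBED_DIMENSIONS` could be `titan_v2_dim_ok 768 || echo "Titan v2 rejects 768"`.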

Reasoning Backbone (#7)

| Env var | Default | Values | Effect |
| --- | --- | --- | --- |
| `CHATCLI_QUALITY_REASONING_MODE` | `auto` | `off\|auto\|on` | Auto-attach policy |
| `CHATCLI_QUALITY_REASONING_BUDGET` | `8000` | int (tokens) | Thinking budget (Anthropic); mapped to tier on OpenAI |
| `CHATCLI_QUALITY_REASONING_AUTO_AGENTS` | `planner,refiner,verifier,reflexion` | CSV | List for mode=auto |
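As a rough mental model of the auto-attach policy: `on` attaches reasoning for every agent, `off` for none, and `auto` only for agents on the list. The sketch below encodes just that; `should_attach_reasoning` is a hypothetical helper, and `/thinking` overrides or skill-level `effort:` hints are not modeled.

```shell
# Usage: should_attach_reasoning AGENT
# Exit 0 when reasoning would be auto-attached for the given agent.
should_attach_reasoning() {
  mode="${CHATCLI_QUALITY_REASONING_MODE:-auto}"
  agents="${CHATCLI_QUALITY_REASONING_AUTO_AGENTS:-planner,refiner,verifier,reflexion}"
  case "$mode" in
    on)  return 0 ;;
    off) return 1 ;;
    auto)
      case ",$agents," in
        *",$1,"*) return 0 ;;   # agent is on the auto list
        *)        return 1 ;;
      esac ;;
    *) echo "unknown mode: $mode" >&2; return 2 ;;
  esac
}
```

Under the defaults, `should_attach_reasoning planner` succeeds and `should_attach_reasoning formatter` does not.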

Per-agent overrides (apply to any built-in)

| Env var | Effect |
| --- | --- |
| `CHATCLI_AGENT_<NAME>_MODEL` | Force specific model for that agent |
| `CHATCLI_AGENT_<NAME>_EFFORT` | Force effort tier (`low\|medium\|high\|max`) |

<NAME> is the agent name in uppercase. The agents introduced by the quality pipeline are REFINER and VERIFIER.

# Examples
export CHATCLI_AGENT_REFINER_MODEL="claude-haiku-4-5"
export CHATCLI_AGENT_VERIFIER_MODEL="claude-opus-4-8"
export CHATCLI_AGENT_VERIFIER_EFFORT="max"
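When overrides are set from a script, the variable name can be derived from the agent name instead of typed by hand. A small sketch follows; `agent_env_var` is a hypothetical helper that only uppercases the name, so agent names containing characters that are not valid in env var names are out of scope.

```shell
# Usage: agent_env_var AGENT SUFFIX
# Prints the override variable name, e.g. CHATCLI_AGENT_REFINER_MODEL.
agent_env_var() {
  name=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
  printf 'CHATCLI_AGENT_%s_%s\n' "$name" "$2"
}

# Set the effort tier for the verifier without spelling out the variable name.
export "$(agent_env_var verifier EFFORT)=max"
```

After this runs, `CHATCLI_AGENT_VERIFIER_EFFORT` is `max`, matching the hand-written example above.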

Recommended presets

Cheap dev (default)

# No adjustments: chatcli out-of-the-box is already cheap.
# ReAct + Reasoning auto for Planner + Reflexion on (only on errors)

Extra cost per turn: zero normally; +1 LLM call on rare errors.

Rigorous code review

export CHATCLI_QUALITY_REFINE_ENABLED=true
export CHATCLI_QUALITY_REFINE_MAX_PASSES=2
export CHATCLI_QUALITY_VERIFY_ENABLED=true
export CHATCLI_QUALITY_VERIFY_NUM_QUESTIONS=5

Extra cost: +3-4 calls per worker with non-mechanical output. Use in critical PR reviews.

Technical docs

export CHATCLI_QUALITY_REFINE_ENABLED=true
export CHATCLI_QUALITY_VERIFY_ENABLED=true
export CHATCLI_QUALITY_HYDE_ENABLED=true
# Optional: HyDE vectors for max recall
export CHATCLI_QUALITY_HYDE_USE_VECTORS=true

# Pick an embedding provider (any of these works):
# Voyage (Anthropic-recommended)
export CHATCLI_EMBED_PROVIDER=voyage
export VOYAGE_API_KEY=pa-...

# OpenAI (if OPENAI_API_KEY is already in the environment)
# export CHATCLI_EMBED_PROVIDER=openai

# Bedrock (same AWS chain as chat, no extra API key)
# export CHATCLI_EMBED_PROVIDER=bedrock
# export CHATCLI_EMBED_MODEL=amazon.titan-embed-text-v2:0
# export BEDROCK_REGION=us-east-1

Extra cost: +1 HyDE hypothesis + refine + verify per doc generation. Polished and factually-checked output.

Incident investigation

export CHATCLI_QUALITY_REASONING_MODE=on
export CHATCLI_QUALITY_REASONING_BUDGET=16384
export CHATCLI_QUALITY_REFLEXION_ON_LOW_QUALITY=true

Extra cost: max thinking across all agents + generous lesson generation. Use in root-cause analysis.

Batch autopilot

export CHATCLI_QUALITY_PLAN_FIRST_MODE=always
export CHATCLI_QUALITY_REFINE_ENABLED=true
export CHATCLI_QUALITY_VERIFY_ENABLED=true
export CHATCLI_QUALITY_REASONING_MODE=on

Extra cost: all patterns always on. Use in workflows without a user in front (cron, CI).

Turn everything off

If something breaks in production and you need to return to pre-pipeline behavior instantly:

export CHATCLI_QUALITY_ENABLED=false

This makes Pipeline.Run degenerate to return agent.Execute(...), which is byte-identical to pre-pipeline chatcli: zero hooks, zero reasoning auto-attach, zero HyDE.

The master switch is the emergency exit. Individual toggles are for tuning scenarios; the master is for rollback.

Interaction with other configs

The harness/quality pipeline does not conflict with, nor replace:

| Feature | Relationship |
| --- | --- |
| Skills (with frontmatter `effort:`) | Wins over `applyAutoReasoning` (skill hint already on ctx) |
| Personas (custom agents) | Can have `effort:` in their `.md`; per-agent override still works |
| Multi-agent orchestration | Dispatcher stays identical; pipeline only wraps each worker |
| Policy (CODER policy rules) | Applied inside the worker, independent of refine/verify/reflexion |
| MCP tools | Result of MCP tools goes through the pipeline like any other worker output |

See also

Seven patterns overview: back to the hub.
Environment Variables (full): the whole ChatCLI env var matrix.
Command Reference: every slash.
/config (hierarchical): how /config <section> was structured.
