This page lists all environment variables that ChatCLI recognizes. Configure them in your .env file or via export in the shell.
General
| Variable | Description | Default |
|---|---|---|
LLM_PROVIDER | Active provider: OPENAI, OPENAI_ASSISTANT, CLAUDEAI, BEDROCK, GOOGLEAI, XAI, ZAI, MINIMAX, MOONSHOT, OPENROUTER, STACKSPOT, OLLAMA, COPILOT, GITHUB_MODELS | Auto-detected |
CHATCLI_DOTENV | Custom path for the .env file | .env |
CHATCLI_LANG | Force interface language (pt-BR, en) | Auto-detected |
CHATCLI_IGNORE | Path to .chatignore rules file | Auto-detected |
CHATCLI_THEME | Interface color theme (chat, cards, markdown, spinners). 11 themes: dark, light + 9 community palettes (dracula, nord, tokyo-night, solarized-dark/light, gruvbox, catppuccin-mocha, monokai, one-dark). Switch at runtime via /config ui theme. See the Theme System. | dark |
LOG_LEVEL | Log level: debug, info, warn, error | info |
LOG_FILE | Log file path | ~/.chatcli/app.log |
LOG_MAX_SIZE | Maximum log size before rotation | 100MB |
CHATCLI_ENV | Logging mode: dev (colored console + file), prod (file-only JSON). Backward-compatible with legacy ENV. | prod |
MAX_RETRIES | Maximum retries for API calls | 5 |
INITIAL_BACKOFF | Initial time between retries (seconds) | 3 |
HISTORY_FILE | Path to history file (supports ~) | .chatcli_history |
HISTORY_MAX_SIZE | Maximum history size before rotation | 100MB |
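As a minimal sketch, the general settings above can be exported in the shell before launching ChatCLI. The values are illustrative choices rather than recommendations, and the idea that a .env file would carry the same KEY=value pairs without `export` is an assumption based on common .env conventions.

```shell
# Illustrative values only; adjust to your environment.
export LLM_PROVIDER=OPENAI    # pin the provider instead of relying on auto-detection
export CHATCLI_LANG=en        # force the English interface
export CHATCLI_THEME=nord     # one of the community palettes listed above
export LOG_LEVEL=debug        # more verbose than the default "info"
export CHATCLI_ENV=dev        # colored console output plus the log file
export MAX_RETRIES=3          # fewer retries than the default 5
export INITIAL_BACKOFF=2      # seconds before the first retry
```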
LLM Providers
OpenAI
| Variable | Description | Default |
|---|---|---|
OPENAI_API_KEY | API key | — |
OPENAI_MODEL | Model to use | gpt-5.4 |
OPENAI_ASSISTANT_MODEL | Model for assistant (agent mode) | gpt-4o |
OPENAI_MAX_TOKENS | Response token limit | 60000 |
OPENAI_USE_RESPONSES | Use Responses API (e.g., for gpt-5) | false |
Anthropic (Claude)
| Variable | Description | Default |
|---|---|---|
ANTHROPIC_API_KEY | API key | — |
ANTHROPIC_MODEL | Model to use (e.g., claude-opus-4-8, claude-opus-4-7, claude-sonnet-4-6) | claude-sonnet-4-6 |
ANTHROPIC_MAX_TOKENS | Response token limit | 20000 |
ANTHROPIC_API_VERSION | API version | 2023-06-01 |
ANTHROPIC_SPEED | Set to fast to opt in to Opus 4.8 fast mode (research preview, ~2.5× output tokens/sec at premium pricing). Silently ignored on models without the fast_mode catalog capability — safe to leave set when switching models. | — |
ANTHROPIC_1MTOKENS_SONNET | Enable the 1M-context beta header on Sonnet 4.x OAuth requests. Opus 4.7/4.8 already ship with 1M native context and ignore this flag. | false |
AWS Bedrock (full catalog)
BEDROCK provider — invokes the entire Bedrock catalog (Anthropic, OpenAI, Llama, Nova, Mistral, Cohere, AI21, DeepSeek, Moonshot Kimi (also available directly via MOONSHOT), MiniMax, Qwen, Z.AI/GLM, Gemma, Nemotron, TwelveLabs, and any provider AWS adds) using the SDK’s credential chain (env vars, ~/.aws/credentials, SSO via ~/.aws/config, IAM role). Activation requires real credentials — the mere existence of ~/.aws/config with only region/output does not activate Bedrock.
| Variable | Description | Default |
|---|---|---|
BEDROCK_PROVIDER | Schema override: anthropic / claude, openai / gpt, converse / auto. Overrides prefix auto-detection. | auto-detect |
BEDROCK_TEMPERATURE | Temperature (used by OpenAI and Converse paths) | — |
BEDROCK_TOP_P | Top-p sampling (used by the Converse path) | — |
BEDROCK_REGION | AWS region (takes precedence over AWS_REGION) | — |
AWS_REGION | AWS region (fallback) | — |
AWS_PROFILE | Profile in ~/.aws/credentials or ~/.aws/config (SSO, assume-role, credential_process). Can be set in .env. | — |
AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY / AWS_SESSION_TOKEN | Static credentials | — |
AWS_CA_BUNDLE | PEM bundle read natively by the SDK | — |
AWS_ENDPOINT_URL_BEDROCK_RUNTIME | Override for Bedrock Runtime endpoint (VPC endpoints) | — |
AWS_ENDPOINT_URL_BEDROCK | Override for Bedrock (control plane) endpoint | — |
AWS_EC2_METADATA_DISABLED | true explicitly disables IMDS (169.254.169.254) | — |
CHATCLI_BEDROCK_ENABLE_IMDS | 1/true forces IMDS probe on non-EC2 machines | false |
BEDROCK_MAX_TOKENS | Response token limit | From catalog |
CHATCLI_BEDROCK_CA_BUNDLE | Bedrock-specific PEM bundle. Overrides AWS_CA_BUNDLE and the global CHATCLI_CA_BUNDLE (which acts as fallback). See Global TLS Trust. | — |
CHATCLI_BEDROCK_INSECURE_SKIP_VERIFY | true disables TLS verification (insecure, troubleshooting only). Overrides the global CHATCLI_TLS_INSECURE_SKIP_VERIFY (which acts as fallback). | false |
Modern models on Bedrock (Claude 3.7+/4.x/4.5/4.6/4.7 and equivalents from other providers) require inference profile IDs (prefix global., us., eu., or apac.). /switch --model automatically filters out base IDs that don’t support ON_DEMAND, so only invokable IDs + profiles appear in the listing. See AWS Bedrock for details.
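A possible Bedrock setup using a named profile might look like the sketch below. The profile name `my-sso-profile` and the region are hypothetical; substitute the ones from your own ~/.aws/config.

```shell
export LLM_PROVIDER=BEDROCK
export AWS_PROFILE=my-sso-profile   # hypothetical profile name (SSO, assume-role, ...)
export BEDROCK_REGION=us-east-1     # example region; takes precedence over AWS_REGION
export BEDROCK_PROVIDER=auto        # keep schema auto-detection from the model prefix
```

Because activation requires real credentials, the profile has to resolve to usable credentials (for SSO, that usually means logging in first); region and output settings alone are not enough.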
Google AI (Gemini)
| Variable | Description | Default |
|---|---|---|
GOOGLEAI_API_KEY | API key | — |
GOOGLEAI_MODEL | Model to use | gemini-2.5-flash |
GOOGLEAI_MAX_TOKENS | Response token limit | 50000 |
xAI (Grok)
| Variable | Description | Default |
|---|---|---|
XAI_API_KEY | API key | — |
XAI_MODEL | Model to use | grok-4-1 |
XAI_MAX_TOKENS | Response token limit | 50000 |
Ollama (Local Models)
| Variable | Description | Default |
|---|---|---|
OLLAMA_ENABLED | Enable Ollama API (required) | false |
OLLAMA_BASE_URL | Ollama server URL | http://localhost:11434 |
OLLAMA_MODEL | Model to use | — |
OLLAMA_MAX_TOKENS | Response token limit | 5000 |
OLLAMA_FILTER_THINKING | Filters intermediate reasoning from models like Qwen3 | true |
For Agent mode to work well with some Ollama models that “think out loud” (Qwen3, Llama3…), keep OLLAMA_FILTER_THINKING=true.
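A local Ollama setup could be sketched as follows. The model tag `qwen3` is an assumption; use whichever model you have actually pulled on the Ollama server.

```shell
export LLM_PROVIDER=OLLAMA
export OLLAMA_ENABLED=true                      # required, the default is false
export OLLAMA_BASE_URL=http://localhost:11434   # the documented default, shown for clarity
export OLLAMA_MODEL=qwen3                       # assumed tag; no default is provided
export OLLAMA_FILTER_THINKING=true              # hide intermediate reasoning in Agent mode
```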
ZAI (Zhipu AI)
| Variable | Description | Default |
|---|---|---|
ZAI_API_KEY | API key (Bearer token or id.secret for JWT) | — |
ZAI_MODEL | Model to use | glm-5 |
ZAI_MAX_TOKENS | Response token limit | 50000 |
Keys in id.secret format automatically enable JWT token rotation (HMAC-SHA256). Tokens are cached for 30 minutes with a 5-minute safety margin before regeneration. Keys without a "." continue to work as traditional Bearer tokens. No additional configuration is needed.
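The key-format rule can be illustrated with a small shell helper. This is only a sketch of the rule as described here (a "." in the key selects the JWT path), not ChatCLI's actual code; the function name `zai_auth_mode` and the sample keys are hypothetical.

```shell
# Classify a ZAI key: "jwt" when it contains a ".", "bearer" otherwise.
zai_auth_mode() {
  case "$1" in
    *.*) echo "jwt" ;;
    *)   echo "bearer" ;;
  esac
}

zai_auth_mode "abc123.s3cr3t"      # id.secret style key
zai_auth_mode "plainbearertoken"   # traditional Bearer token
```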
MiniMax
| Variable | Description | Default |
|---|---|---|
MINIMAX_API_KEY | API key | — |
MINIMAX_MODEL | Model to use | MiniMax-M2.7 |
MINIMAX_MAX_TOKENS | Response token limit | 50000 |
MINIMAX_API_COMPAT | API compatibility mode: anthropic to use the Anthropic Messages endpoint | — |
Set MINIMAX_API_COMPAT=anthropic to use the Anthropic Messages-compatible endpoint (https://api.minimax.io/anthropic/v1/messages). The anthropic-version: 2023-06-01 header is added automatically. Bearer token auth remains the same. Native tool calling is disabled in this mode (falls back to XML).
Moonshot (Kimi)
| Variable | Description | Default |
|---|---|---|
MOONSHOT_API_KEY | API key (Bearer token) | — |
MOONSHOT_MODEL | Model to use (kimi-k2.6, kimi-k2.5, kimi-latest, moonshot-v1-*) | kimi-k2.6 |
MOONSHOT_MAX_TOKENS | Response token limit | catalog (131072 for K2.6) |
MOONSHOT_THINKING | Reasoning mode: enabled, disabled, auto | auto |
MOONSHOT_API_URL | Custom endpoint URL | https://api.moonshot.ai/v1/chat/completions |
MOONSHOT_THINKING=disabled forces Instant mode (direct response, cheaper) even on models with the thinking capability. enabled forces explicit reasoning. auto (default) lets the model choose. Models without the thinking capability ignore the flag — no surprise billing.
OpenRouter
| Variable | Description | Default |
|---|---|---|
OPENROUTER_API_KEY | API key from openrouter.ai | — |
OPENROUTER_API_URL | Custom API endpoint URL | https://openrouter.ai/api/v1/chat/completions |
OPENROUTER_MAX_TOKENS | Response token limit | — |
OPENROUTER_FALLBACK_MODELS | Comma-separated list of fallback models for server-side routing (e.g., anthropic/claude-sonnet-4,google/gemini-2.5-flash) | — |
OPENROUTER_PROVIDER_ORDER | Comma-separated preferred provider ordering (e.g., Anthropic,Google) | — |
OPENROUTER_TRANSFORMS | Message transforms (e.g., middle-out for context overflow handling) | — |
OPENROUTER_HTTP_REFERER | Attribution HTTP Referer header sent with requests | — |
OPENROUTER_APP_TITLE | Attribution app title sent with requests | — |
OPENROUTER_TOOLS | JSON array of tool definitions to inject into requests | — |
OpenRouter is a multi-provider gateway — a single API key gives access to 200+ models from OpenAI, Anthropic, Google, Meta, Mistral, DeepSeek, and more. Models use the provider/model-name format (e.g., openai/gpt-4o, anthropic/claude-sonnet-4). The default model is openai/gpt-4o.
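For illustration, the routing variables can be combined as in the sketch below, which reuses the example values from the table. Whether these particular models and providers suit your account is for you to verify.

```shell
export LLM_PROVIDER=OPENROUTER
# Server-side fallbacks, tried when the primary model is unavailable.
export OPENROUTER_FALLBACK_MODELS="anthropic/claude-sonnet-4,google/gemini-2.5-flash"
# Preferred upstream providers, in order.
export OPENROUTER_PROVIDER_ORDER="Anthropic,Google"
# Let OpenRouter compress the middle of over-long contexts.
export OPENROUTER_TRANSFORMS="middle-out"
```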
GitHub Copilot
| Variable | Description | Default |
|---|---|---|
GITHUB_COPILOT_TOKEN | Authentication token | — |
COPILOT_MODEL | Model to use | — |
COPILOT_MAX_TOKENS | Response token limit | — |
COPILOT_API_BASE_URL | API base URL | — |
CHATCLI_COPILOT_CLIENT_ID | Custom client ID | — |
StackSpot
| Variable | Description | Default |
|---|---|---|
CLIENT_ID | Client ID | — |
CLIENT_KEY | Client Key | — |
STACKSPOT_REALM | Realm/Tenant | — |
STACKSPOT_AGENT_ID | Agent ID | — |
StackSpot’s agent API decides the output limit server-side (it ignores max_tokens in the payload). ChatCLI’s catalog assumes a 128K context window for StackSpot agents — previously the generic 50K fallback made auto-compact fire on almost every turn. If your agent’s foundation model has a different window, tune it with CHATCLI_CONTEXT_WINDOW.
Agent Mode
| Variable | Description | Default |
|---|---|---|
CHATCLI_AGENT_CMD_TIMEOUT | Timeout per executed command (Go duration: 30s, 2m, 10m) | 10m |
CHATCLI_AGENT_DENYLIST | Extra regex patterns to block commands (separated by ;) | — |
CHATCLI_AGENT_ALLOW_SUDO | Allow sudo without automatic blocking | false |
CHATCLI_AGENT_INLINE_CODE_STRICT | For python -c / node -e / perl -e / ruby -e / php -r / lua -e invocations, treat inline source as dangerous unless proven safe (conservative mode). Default: the classifier accepts read-only one-liners (print(1), import sys; print(sys.version)) and only blocks patterns with os.system, subprocess, socket, eval, exec, file writes, network. | false |
CHATCLI_AGENT_PLUGIN_MAX_TURNS | Maximum agent turns | 50 |
CHATCLI_AGENT_PLUGIN_TIMEOUT | Total agent plugin timeout | 15m |
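A stricter Agent mode profile might be sketched like this. The two denylist patterns are hypothetical examples, and how ChatCLI anchors or matches them is not specified on this page, so test them against your own commands.

```shell
export CHATCLI_AGENT_CMD_TIMEOUT=2m   # Go duration, shorter than the 10m default
# Two extra regex patterns, separated by ";" (hypothetical examples).
export CHATCLI_AGENT_DENYLIST='^terraform destroy;kubectl delete (ns|namespace)'
export CHATCLI_AGENT_INLINE_CODE_STRICT=true

# Print each pattern on its own line to double-check the ";" splitting.
printf '%s\n' "$CHATCLI_AGENT_DENYLIST" | tr ';' '\n'
```

Single quotes keep the shell from interpreting the `|` and parentheses inside the patterns.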
Multi-Agent (Parallel Orchestration)
| Variable | Description | Default |
|---|---|---|
CHATCLI_AGENT_PARALLEL_MODE | Enable parallel multi-agent orchestration | true |
CHATCLI_AGENT_MAX_WORKERS | Maximum simultaneous workers (goroutines) | 4 |
CHATCLI_AGENT_WORKER_MAX_TURNS | Maximum turns per worker | 10 |
CHATCLI_AGENT_WORKER_TIMEOUT | Timeout per individual worker | 5m |
CHATCLI_AGENT_PARALLEL_TOOLS | Enables parallel execution of concurrency-safe tools inside a single agent (read-only operations such as @websearch, @webfetch, @coder read/search/tree in the same turn). Distinct from CHATCLI_AGENT_PARALLEL_MODE, which controls multiple agents running in parallel. Off by default while the feature is in rollout. | false |
CHATCLI_AGENT_MAX_TOOL_CONCURRENCY | Fan-out cap for the parallel tool batch within an agent. Operators processing many @webfetch calls in parallel can raise it; more conservative setups can lower it. | 10 |
Mixture-of-Agents (MoA)
An ensemble where several models propose an answer in parallel and an aggregator synthesizes the best one. Distinct from the parallel orchestration above (which dispatches specialist agents): here the axis is model/provider diversity. Triggered by the /moa command. See Mixture-of-Agents.
| Variable | Description | Default |
|---|---|---|
CHATCLI_MOA_MODELS | Ensemble proposers, as CSV provider:model (e.g. openai:gpt-5,claudeai:claude-opus-4-8,googleai:gemini-2.5-pro). Optional: without it, /moa and @moa use the configured providers (capped at 4). | — |
CHATCLI_MOA_AGGREGATOR | Model that synthesizes the proposals, as provider:model. If omitted, uses the session’s active provider/model. | Active provider |
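The sketch below sets an ensemble using the example values from the table and then splits the CSV to show how each `provider:model` entry breaks down. The splitting loop is only a way to sanity-check the value in the shell; it is not how ChatCLI parses it.

```shell
export CHATCLI_MOA_MODELS="openai:gpt-5,claudeai:claude-opus-4-8,googleai:gemini-2.5-pro"
export CHATCLI_MOA_AGGREGATOR="claudeai:claude-opus-4-8"

# Split on "," and then on the first ":" of each entry.
old_ifs=$IFS
IFS=','
for entry in $CHATCLI_MOA_MODELS; do
  echo "provider=${entry%%:*} model=${entry#*:}"
done
IFS=$old_ifs
```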
Token Efficiency
Knobs for the token-saving optimizations (structured prompt caching, stagnation detector, smart routing, webfetch auto-save, microcompaction). See Token Efficiency for details.
Stagnation early-exit
| Variable | Description | Default |
|---|---|---|
CHATCLI_AGENT_EARLY_EXIT | Toggle the repeated-tool_calls detector in the ReAct loop. Accepts 0/false/off/no to disable. | 1 (on) |
CHATCLI_AGENT_EARLY_EXIT_TURNS | How many consecutive turns with the same tool_calls batch trigger the break. Clamped to [2, 10]. | 3 |
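The clamping of CHATCLI_AGENT_EARLY_EXIT_TURNS to [2, 10] can be pictured with the helper below. It is a sketch of the documented range only; the function name `clamp_early_exit_turns` is hypothetical and the real implementation may handle non-numeric input differently.

```shell
# Clamp an integer to the documented [2, 10] range.
clamp_early_exit_turns() {
  n=$1
  if [ "$n" -lt 2 ]; then n=2; fi
  if [ "$n" -gt 10 ]; then n=10; fi
  echo "$n"
}

clamp_early_exit_turns 1    # below the range
clamp_early_exit_turns 3    # the default, unchanged
clamp_early_exit_turns 15   # above the range
```

In other words, setting the variable to 1 or 15 would behave like 2 or 10 respectively.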
An advisory guard (it never aborts) that detects the same tool failing repeatedly inside the agent loop and injects a hint for the model to change approach instead of retrying the same error.
| Variable | Description | Default |
|---|---|---|
CHATCLI_AGENT_TOOLGUARD | Toggle the repeated tool-failure guard. Accepts 0/false to disable. | 1 (on) |
Smart chat ↔ agent routing
| Variable | Description | Default |
|---|---|---|
CHATCLI_AGENT_SMART_ROUTE | Trivial-query classifier mode. off (disabled), hint (tip only — default), auto (auto-redirect trivial → chat mode). Aliases: 0/false, 1/on/true, redirect/2. | hint |
WebFetch auto-save
| Variable | Description | Default |
|---|---|---|
CHATCLI_WEBFETCH_AUTOSAVE_BYTES | Byte threshold. @webfetch bodies larger than this AND without a filter/range are auto-persisted to the scratch dir and only a preview is returned. | 10000 |
CHATCLI_WEBFETCH_RENDER | Headless rendering of JS pages in @webfetch: auto (heuristics detect SPA shells), always, never. See Web Tools. | auto |
CHATCLI_WEBFETCH_RENDER_TIMEOUT | Headless render timeout, in seconds. | 25 |
CHATCLI_WEBFETCH_RENDER_BROWSER | Absolute path to a specific Chromium-based binary (Chrome/Chromium/Edge/Brave are auto-detected without it). | (auto-detect) |
CHATCLI_WEBFETCH_RENDER_AUTOPROVISION | Allows the one-time download of a pinned Chromium (~150 MB) when no browser is found. | false |
Microcompaction is applied to the session history to shrink old tool results. See also Tool Result Management.
| Variable | Description | Default |
|---|---|---|
CHATCLI_MICROCOMPACT_TRUNCATE_TURNS | After how many turns old tool results become head+tail previews. | 2 |
CHATCLI_MICROCOMPACT_SUMMARIZE_TURNS | After how many turns tool results become a one-line summary. | 4 |
CHATCLI_MICROCOMPACT_HEAD_CHARS | Head size kept during truncation. | 2000 |
CHATCLI_MICROCOMPACT_TAIL_CHARS | Tail size kept during truncation. | 500 |
CHATCLI_MICROCOMPACT_MIN_CONTENT | Minimum tool result size to be a compaction candidate. | 3000 |
For chat/lookup sessions where token frugality matters more than long-term recall, tighten the knobs:

```shell
export CHATCLI_MICROCOMPACT_TRUNCATE_TURNS=1
export CHATCLI_MICROCOMPACT_SUMMARIZE_TURNS=3
export CHATCLI_MICROCOMPACT_HEAD_CHARS=1200
export CHATCLI_MICROCOMPACT_TAIL_CHARS=300
export CHATCLI_MICROCOMPACT_MIN_CONTENT=2000
```
Harness/Quality Pipeline (7 Patterns)
Variables for the harness/quality pipeline that implements the seven LLM-agent patterns. See full overview and detailed configuration.
Master switch
| Variable | Description | Default |
|---|---|---|
CHATCLI_QUALITY_ENABLED | Master switch for the pipeline. false disables everything (falls back to direct agent.Execute) | true |
Self-Refine (#5)
| Variable | Description | Default |
|---|---|---|
CHATCLI_QUALITY_REFINE_ENABLED | Enable RefineHook after each worker | false |
CHATCLI_QUALITY_REFINE_MAX_PASSES | Maximum critique-rewrite passes | 1 |
CHATCLI_QUALITY_REFINE_MIN_BYTES | Don’t refine outputs smaller than N bytes | 200 |
CHATCLI_QUALITY_REFINE_EPSILON | Char-level fallback threshold in characters | 50 |
CHATCLI_QUALITY_REFINE_EXCLUDE | CSV of agents that are not refined | formatter,deps,refiner,verifier |
Semantic convergence cascade (char → Jaccard → embedding)
| Variable | Description | Default |
|---|---|---|
CHATCLI_QUALITY_REFINE_CONVERGENCE_ENABLED | Enable the cascade; false = char-level heuristic only | true |
CHATCLI_QUALITY_REFINE_CONVERGENCE_EMBEDDING | Include embedding scorer (requires CHATCLI_EMBED_PROVIDER) | false |
CHATCLI_QUALITY_REFINE_CONVERGENCE_STRICT | Strict mode: refuse convergence without embedding | false |
CHATCLI_QUALITY_REFINE_CONVERGENCE_CHAR_HIGH | Char sim ≥ X → short-circuit CONVERGED | 0.99 |
CHATCLI_QUALITY_REFINE_CONVERGENCE_CHAR_LOW | Char sim < X → short-circuit DIVERGED | 0.3 |
CHATCLI_QUALITY_REFINE_CONVERGENCE_JACCARD_HIGH | Jaccard sim ≥ X + high confidence → CONVERGED | 0.95 |
CHATCLI_QUALITY_REFINE_CONVERGENCE_EMBEDDING_SIM | Final embedding cosine threshold | 0.92 |
CHATCLI_QUALITY_REFINE_CONVERGENCE_CACHE_SIZE | Embedding LRU cache entries | 256 |
CHATCLI_QUALITY_REFINE_CONVERGENCE_CACHE_TTL_MIN | Cache TTL (minutes) | 5 |
CHATCLI_QUALITY_REFINE_CONVERGENCE_BREAKER_THRESHOLD | Consecutive embedder failures before breaker opens | 3 |
Chain-of-Verification (#6)
| Variable | Description | Default |
|---|---|---|
CHATCLI_QUALITY_VERIFY_ENABLED | Enable VerifyHook after each worker | false |
CHATCLI_QUALITY_VERIFY_NUM_QUESTIONS | Number of verification questions | 3 |
CHATCLI_QUALITY_VERIFY_REWRITE | Rewrite output when discrepancy detected | true |
CHATCLI_QUALITY_VERIFY_EXCLUDE | CSV of agents that are not verified | formatter,deps,shell,refiner,verifier |
Reflexion (#3)
| Variable | Description | Default |
|---|---|---|
CHATCLI_QUALITY_REFLEXION_ENABLED | Master switch for ReflexionHook | true |
CHATCLI_QUALITY_REFLEXION_ON_ERROR | Fire on worker error | true |
CHATCLI_QUALITY_REFLEXION_ON_HALLUCINATION | Fire when CoVe flags discrepancy | true |
CHATCLI_QUALITY_REFLEXION_ON_LOW_QUALITY | Fire when refine returns a low score | false |
CHATCLI_QUALITY_REFLEXION_PERSIST | Persist lessons to memory.Fact | true |
Durable queue (WAL + worker pool + DLQ)
When enabled, reflexion triggers flow through a persistent queue — lessons survive process crashes via WAL replay on next boot.
| Variable | Description | Default |
|---|---|---|
CHATCLI_QUALITY_REFLEXION_QUEUE_ENABLED | Queue master switch. false reverts to legacy detached-goroutine mode | true |
CHATCLI_QUALITY_REFLEXION_QUEUE_WORKERS | Worker pool size | 2 |
CHATCLI_QUALITY_REFLEXION_QUEUE_CAPACITY | Max in-memory depth before overflow policy | 1000 |
CHATCLI_QUALITY_REFLEXION_QUEUE_DROP_OLDEST | Overflow: true = drop oldest; false = block | false |
CHATCLI_QUALITY_REFLEXION_QUEUE_BLOCK_TIMEOUT | Enqueue wait time when queue is full (Go duration) | 5s |
CHATCLI_QUALITY_REFLEXION_QUEUE_MAX_ATTEMPTS | Total retries before moving to DLQ | 5 |
CHATCLI_QUALITY_REFLEXION_QUEUE_INITIAL_DELAY | First retry delay | 1s |
CHATCLI_QUALITY_REFLEXION_QUEUE_MAX_DELAY | Cap on exponential retry | 5m |
CHATCLI_QUALITY_REFLEXION_QUEUE_JITTER | Fractional jitter ([0, 0.5]) | 0.2 |
CHATCLI_QUALITY_REFLEXION_QUEUE_JOB_TIMEOUT | Per-processor-call timeout (LLM + persist) | 2m |
CHATCLI_QUALITY_REFLEXION_QUEUE_STALE_AFTER | Records older than this discarded on replay | 168h |
CHATCLI_QUALITY_REFLEXION_QUEUE_BASE_DIR | Override of wal/ and dlq/ root directory | — |
Plan-and-Solve / ReWOO (#2)
| Variable | Description | Default |
|---|---|---|
CHATCLI_QUALITY_PLAN_FIRST_MODE | off / auto / always: when to fire Plan-First | auto |
CHATCLI_QUALITY_PLAN_FIRST_THRESHOLD | Minimum score (0-10) for auto to fire | 6 |
RAG + HyDE (#4)
| Variable | Description | Default |
|---|---|---|
CHATCLI_QUALITY_HYDE_ENABLED | Enable phase 3a (hypothesis-based keyword expansion) | false |
CHATCLI_QUALITY_HYDE_USE_VECTORS | Enable phase 3b (cosine vector search) | false |
CHATCLI_QUALITY_HYDE_NUM_KEYWORDS | Cap on keywords extracted from the hypothesis | 5 |
Embedding Providers (used by HyDE 3b)
| Variable | Description | Default |
|---|---|---|
CHATCLI_EMBED_PROVIDER | voyage / openai / bedrock / null | null |
CHATCLI_EMBED_MODEL | Embedding model. Provider defaults: Voyage voyage-3; OpenAI text-embedding-3-small; Bedrock amazon.titan-embed-text-v2:0. | provider default |
CHATCLI_EMBED_DIMENSIONS | OpenAI: truncate via Matryoshka. Bedrock Titan v2: accepts 256 / 512 / 1024 (rejects others). Bedrock Titan v1 / Cohere v3: fixed dim, ignored. | model native |
VOYAGE_API_KEY | Required for provider=voyage | — |
OPENAI_API_KEY | Required for provider=openai (shared with chat) | — |
BEDROCK_REGION / AWS_REGION / AWS_PROFILE / AWS credentials | Required for provider=bedrock — reuses the same chain as chat. | per chat Bedrock |
Bedrock embeddings support amazon.titan-embed-text-v2:0 (default), amazon.titan-embed-text-v1, cohere.embed-english-v3, and cohere.embed-multilingual-v3. Titan parallelizes batches with an 8-worker pool (the API accepts only 1 text per call); Cohere v3 sends the entire batch in one call. Dispatch is automatic from the model id prefix. See RAG + HyDE and AWS Bedrock.
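As a sketch, enabling both HyDE phases with Bedrock Titan v2 embeddings could look like this. The region is an example value, and 512 is simply one of the three dimensions Titan v2 is documented to accept.

```shell
export CHATCLI_QUALITY_HYDE_ENABLED=true        # phase 3a: keyword expansion
export CHATCLI_QUALITY_HYDE_USE_VECTORS=true    # phase 3b: cosine vector search
export CHATCLI_EMBED_PROVIDER=bedrock
export CHATCLI_EMBED_MODEL="amazon.titan-embed-text-v2:0"
export CHATCLI_EMBED_DIMENSIONS=512             # Titan v2 accepts 256, 512, or 1024
export BEDROCK_REGION=us-east-1                 # example region; credentials come from the usual chain
```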
Reasoning Backbone (#7)
| Variable | Description | Default |
|---|---|---|
CHATCLI_QUALITY_REASONING_MODE | off / auto / on: auto-attach policy for thinking/reasoning | auto |
CHATCLI_QUALITY_REASONING_BUDGET | Thinking tokens (Anthropic); mapped to tier on OpenAI | 8000 |
CHATCLI_QUALITY_REASONING_AUTO_AGENTS | CSV of agents that receive effort hint automatically in mode=auto | planner,refiner,verifier,reflexion |
Per-Agent Overrides (includes refiner / verifier)
| Variable | Description |
|---|---|
CHATCLI_AGENT_REFINER_MODEL | Specific model for RefinerAgent |
CHATCLI_AGENT_REFINER_EFFORT | Effort tier (low / medium / high / max) |
CHATCLI_AGENT_VERIFIER_MODEL | Specific model for VerifierAgent |
CHATCLI_AGENT_VERIFIER_EFFORT | Effort tier |
Session Workspace and Subagent
Variables that control the Session Workspace (scratch dir + tool-result overflow) and Subagent Delegation.
| Variable | Description | Default |
|---|---|---|
CHATCLI_AGENT_TMPDIR | Read-only — exported automatically by ChatCLI at startup with the absolute path of the session scratch dir. Available to every subprocess launched by the agent’s exec. | (set at startup) |
CHATCLI_AGENT_KEEP_TMPDIR | If true, skip the scratch dir cleanup at session end. Useful for inspecting temporary scripts and overflow files after exit. | false |
CHATCLI_BLOCK_TMP_WRITES | If true, blocks the automatic allowlist extension to os.TempDir() and /tmp. Only the session-isolated scratch dir stays accessible. Use on multi-user hosts or strict CI runners. See Session Workspace. | false (allows) |
CHATCLI_AGENT_SUBAGENT_MAX_DEPTH | Maximum nested delegation depth via delegate_subagent. Protects against pathological recursion. | 2 |
CHATCLI_AGENT_SUBAGENT_MAX_TURNS | Maximum ReAct iterations of each subagent’s internal loop. | 15 |
CHATCLI_TOOL_RESULT_BUDGET_CHARS | Aggregate tool-result budget per turn. Above it, the largest results are saved to the session tool-results/ and replaced by a preview. | 200000 |
CHATCLI_TOOL_RESULT_MAX_CHARS | Maximum size of a single inline tool result. | 20000 |
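A script launched by the agent could use the exported scratch dir roughly as follows. The fallback to `$TMPDIR` or `/tmp` and the file name are assumptions added so the snippet also runs outside a ChatCLI session.

```shell
# Prefer the session scratch dir; fall back to a temp dir when run standalone (assumption).
scratch="${CHATCLI_AGENT_TMPDIR:-${TMPDIR:-/tmp}}"
out="$scratch/chatcli-example-report.txt"   # hypothetical file name

printf 'lines scanned: %s\n' 42 > "$out"
echo "wrote $out"
```

With CHATCLI_AGENT_KEEP_TMPDIR=true, such a file would still be available for inspection after the session ends.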
Global TLS Trust (Corporate Proxy)
For environments behind a corporate proxy/gateway performing TLS inspection with a private CA (Zscaler, Netskope, CrowdStrike Falcon, etc.). Both variables apply to every outbound HTTPS connection in the process: all LLM providers, TTS/STT, embeddings, web tools (@webfetch/@websearch/@osv), gateway channels, MCP transports, skill registries, and the version check. They are the process-wide equivalent of NODE_EXTRA_CA_CERTS / NODE_TLS_REJECT_UNAUTHORIZED in Node.js tools such as Claude Code.
| Variable | Description | Default |
|---|---|---|
CHATCLI_CA_BUNDLE | Path to a PEM bundle with the corporate CA. Merged into the system cert pool and used as RootCAs on every outbound connection (equivalent to NODE_EXTRA_CA_CERTS). An unreadable bundle, or one with no valid certificates, falls back to default verification and logs a warning; it never silently weakens verification. | (system trust store) |
CHATCLI_TLS_INSECURE_SKIP_VERIFY | true disables TLS verification entirely, on every connection (equivalent to NODE_TLS_REJECT_UNAUTHORIZED=0). Logs a loud warning. Insecure — use only to confirm the problem is TLS trust; then configure CHATCLI_CA_BUNDLE. | false |
Go — and therefore ChatCLI — already trusts the operating system’s cert store by default (Keychain on macOS, the Windows cert store, /etc/ssl on Linux). If the corporate CA is already installed on the machine, no configuration is needed. CHATCLI_CA_BUNDLE covers the case where the CA cannot be installed system-wide.
With CHATCLI_TLS_INSECURE_SKIP_VERIFY=true ChatCLI accepts any certificate — invalid, expired, forged — and is exposed to man-in-the-middle attacks (capture of API keys, code and credentials). Never use it in production.
The Bedrock-specific variables (CHATCLI_BEDROCK_CA_BUNDLE / CHATCLI_BEDROCK_INSECURE_SKIP_VERIFY, in the AWS Bedrock section) take precedence over the global ones; when absent, Bedrock inherits the globals as fallback, since the AWS SDK builds its own HTTP client.
Quick diagnosis: unresolved TLS inspection produces an x509 error (certificate signed by unknown authority) — that is what CHATCLI_CA_BUNDLE fixes. A 403 comes from a layer above (WAF/proxy policy: User-Agent, fingerprint, authentication) and is not solved by TLS trust — for that, see the web proxy variables (HTTPS_PROXY, CHATCLI_PROXY_AUTH) in /config resilience.
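A typical fix for the x509 case might be sketched as below. The bundle path is hypothetical, and the `openssl` check is an optional sanity step that is not part of ChatCLI.

```shell
export CHATCLI_CA_BUNDLE="$HOME/certs/corp-root-ca.pem"   # hypothetical path

# Optional: confirm the file is readable and show the first certificate's subject and expiry.
if [ -r "$CHATCLI_CA_BUNDLE" ]; then
  openssl x509 -in "$CHATCLI_CA_BUNDLE" -noout -subject -enddate
else
  echo "bundle not readable: $CHATCLI_CA_BUNDLE" >&2
fi

# Make sure the insecure switch is not left over from troubleshooting.
unset CHATCLI_TLS_INSECURE_SKIP_VERIFY
```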
History Compaction and Payload Recovery
Controls the HistoryCompactor (3-level pipeline: trim → summarize → emergency) and the reactive detection of corporate-proxy/gateway limits. See Context Recovery.
| Variable | Description | Default |
|---|---|---|
CHATCLI_CONTEXT_WINDOW | Global context-window override (in tokens), valid for any provider/model. Takes precedence over the model catalog and the per-provider fallbacks. It is the escape hatch for gateways/agents whose real window differs from the catalog — e.g. a StackSpot agent backed by a larger or smaller foundation model. Since the compaction budget derives from the context window, this env directly controls when auto-compact fires. Accepts a positive integer only; invalid values are silently ignored. | (auto from catalog) |
CHATCLI_MAX_PAYLOAD | Human-friendly ceiling for POST body size. Accepts 5MB, 512KB, 2.5MB, 5 (= 5 MB). When set, the compactor respects this limit as an extra cap on top of the model’s context window, and pre-flight forces aggressive compaction on crossing 85% of it. Essential behind corporate proxies (Cloudflare, Akamai, WAF) that cap body size at 1-5 MB regardless of model context. | (unset — no cap) |
CHATCLI_MAX_RECOVERY_ATTEMPTS | Maximum context-overflow recovery attempts per session. | 3 |
CHATCLI_MAX_TOKEN_ESCALATIONS | Maximum max_tokens escalations per session. | 2 |
CHATCLI_EMERGENCY_KEEP_MESSAGES | Messages kept during Level 2 emergency truncation. | 10 |
CHATCLI_MICROCOMPACT_TRUNCATE_TURNS | Age (in turns) at which old tool results start being truncated to head+tail preview — no LLM call. First line of defense before NeedsCompaction fires. | 2 |
CHATCLI_MICROCOMPACT_SUMMARIZE_TURNS | Age (in turns) at which old tool results get replaced by a one-line summary ([Old tool result cleared — N lines, N chars, type]). | 4 |
If you are behind a corporate proxy and don’t know the exact limit, start with CHATCLI_MAX_PAYLOAD=4MB. If you still get HTTP 413 responses, drop to 3MB. If everything flows smoothly, push up to 5MB or 10MB (depending on the proxy). ChatCLI already leaves 30% headroom for JSON overhead, the system prompt, and tool definitions.
If a 413/WAF/EOF hits without CHATCLI_MAX_PAYLOAD set, ChatCLI automatically assumes 4 MB for the rest of the session (reactive auto-cap) and injects a hint into history telling the AI to prefer line-ranged reads. This prevents a failure loop when the user hasn’t discovered the proxy’s real limit yet.
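To get a feel for the numbers, the sketch below computes the byte value of a 4MB cap and the 85% mark at which pre-flight compaction becomes aggressive. It assumes 1 MB = 1024 × 1024 bytes, which this page does not confirm, and it ignores the headroom ChatCLI applies internally.

```shell
# Assumption: "MB" means 1024*1024 bytes.
payload_mb=4
payload_bytes=$((payload_mb * 1024 * 1024))
compact_at=$((payload_bytes * 85 / 100))   # integer arithmetic, rounds down

export CHATCLI_MAX_PAYLOAD="${payload_mb}MB"
echo "cap=${payload_bytes} bytes, aggressive compaction from about ${compact_at} bytes"
```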
/coder and /agent UI
| Variable | Description | Default |
|---|---|---|
CHATCLI_CODER_UI | Timeline style: full · compact · minimal. Applies to /coder AND /agent (cross-mode since v1.119). | full |
CHATCLI_CODER_BANNER | Show the /coder cheat sheet on session entry | true |
Available styles:
full (default) — full bordered cards ╭── ICON TITLE ─────╮ … ╰─╯. Each tool call is a highlighted action. Best for /agent (supervised plan-and-execute).
compact — inline lines ↻ Read(main.go) / ✓ Read(main.go) 0.3s. Long sessions with dozens of tools stay scannable. Best for /coder.
minimal — middle ground: smaller cards with truncated content.
The variable keeps the legacy name CHATCLI_CODER_UI for back-compat, but as of v1.119 it controls BOTH modes. If you had CHATCLI_CODER_UI=compact set, /agent will now also render compactly.
Color change also in v1.119: tool failures (✗, ❌ EXECUTION FAILED) now render in true red instead of purple. Previously they shared the header color (ColorPurple), which was confusing. If your terminal maps ANSI 31 (red) to a non-red color via theme, adjust the palette.
Provider Fallback
| Variable | Description | Default |
|---|---|---|
CHATCLI_FALLBACK_PROVIDERS | Comma-separated list of providers | — |
CHATCLI_FALLBACK_MODEL_<PROVIDER> | Specific model per provider in the chain | — |
CHATCLI_FALLBACK_MAX_RETRIES | Retries per provider before advancing | 2 |
CHATCLI_FALLBACK_COOLDOWN_BASE | Base cooldown after failure | 30s |
CHATCLI_FALLBACK_COOLDOWN_MAX | Maximum cooldown (exponential backoff) | 5m |
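A fallback chain could be configured along these lines. It assumes the `<PROVIDER>` suffix is the upper-case provider name used by LLM_PROVIDER, and the model names are taken from the provider defaults listed earlier on this page.

```shell
export CHATCLI_FALLBACK_PROVIDERS="OPENAI,CLAUDEAI,GOOGLEAI"
# Per-provider model overrides (suffix format is an assumption).
export CHATCLI_FALLBACK_MODEL_OPENAI="gpt-5.4"
export CHATCLI_FALLBACK_MODEL_CLAUDEAI="claude-sonnet-4-6"
export CHATCLI_FALLBACK_MAX_RETRIES=1        # advance to the next provider sooner than the default 2
export CHATCLI_FALLBACK_COOLDOWN_BASE=15s    # shorter base cooldown than the default 30s
```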
MCP (Model Context Protocol)
| Variable | Description | Default |
|---|---|---|
CHATCLI_MCP_ENABLED | Enable MCP manager | false |
CHATCLI_MCP_CONFIG | Path to MCP configuration JSON | ~/.chatcli/mcp_servers.json |
The MCP subsystem also manages ~/.chatcli/mcp/ for durable state: channels.jsonl (durable push-notification ring, with rotation) and triggers.json (opt-in trigger engine rules). These paths are fixed — no environment variables override them. Details: MCP Channels and MCP Config.
Bootstrap and Memory
| Variable | Description | Default |
|---|---|---|
CHATCLI_BOOTSTRAP_ENABLED | Enable bootstrap file loading | true |
CHATCLI_BOOTSTRAP_DIR | Bootstrap files directory | — |
CHATCLI_MEMORY_ENABLED | Enable persistent memory system | true |
CHATCLI_MEMORY_MAX_SIZE | Maximum size of rendered MEMORY.md (bytes) | 32768 |
CHATCLI_MEMORY_RETENTION_DAYS | Days to retain daily notes before cleanup | 30 |
CHATCLI_MEMORY_MAX_FACTS | Maximum number of facts in memory index | 500 |
CHATCLI_MEMORY_RETRIEVAL_BUDGET | Maximum memory characters in system prompt | 4000 |
CHATCLI_MEMORY_FALLBACK_PROVIDERS | Fallback provider chain for memory extraction (e.g. OPENAI,GOOGLEAI). Empty → falls back to CHATCLI_FALLBACK_PROVIDERS. Segments that fail every provider are queued on disk and retried. | — |
Scheduler
See Scheduler (Chronos) for the full design. All boolean variables accept enabled/disabled/true/false/1/0.
Core
| Variable | Description | Default |
|---|---|---|
CHATCLI_SCHEDULER_ENABLED | Master switch | true |
CHATCLI_SCHEDULER_DATA_DIR | Directory for WAL + snapshot + audit log | ~/.chatcli/scheduler |
CHATCLI_SCHEDULER_MAX_JOBS | Max simultaneous non-terminal jobs | 256 |
CHATCLI_SCHEDULER_WORKER_COUNT | Worker pool goroutines running handleJob | 4 |
CHATCLI_SCHEDULER_WAIT_WORKER_COUNT | Goroutines dedicated to wait loops (reserved, shared pool today) | 8 |
CHATCLI_SCHEDULER_ALLOW_AGENTS | Allow agents to create jobs via @scheduler | true |
CHATCLI_SCHEDULER_ACTION_ALLOWLIST | CSV of allowed action types | slash_cmd,llm_prompt,agent_task,worker_dispatch,hook,noop,webhook,shell,agent_resume,park_poll |
CHATCLI_SCHEDULER_HISTORY_LIMIT | Max ExecutionResults kept per job (ring buffer) | 16 |
CHATCLI_PARK_DIR | Override of the @park snapshot directory — useful for tests and isolated environments. See Agent Park & Resume | $XDG_CONFIG_HOME/chatcli/parked |
Default budget (per-job overrides win)
| Variable | Description | Default |
|---|---|---|
CHATCLI_SCHEDULER_DEFAULT_ACTION_TIMEOUT | Action timeout per fire | 5m |
CHATCLI_SCHEDULER_DEFAULT_POLL_INTERVAL | Interval between wait-condition polls | 5s |
CHATCLI_SCHEDULER_DEFAULT_WAIT_TIMEOUT | Max wait-loop duration | 30m |
CHATCLI_SCHEDULER_DEFAULT_MAX_POLLS | Max polls before giving up (0 = unlimited) | 0 |
CHATCLI_SCHEDULER_DEFAULT_BACKOFF_INITIAL | Initial backoff between retries | 1s |
CHATCLI_SCHEDULER_DEFAULT_BACKOFF_MAX | Exponential backoff cap | 5m |
CHATCLI_SCHEDULER_DEFAULT_BACKOFF_MULT | Backoff multiplicative factor | 2.0 |
CHATCLI_SCHEDULER_DEFAULT_BACKOFF_JITTER | Jitter fraction (0.0–0.5) | 0.2 |
CHATCLI_SCHEDULER_DEFAULT_MAX_RETRIES | Retries on transient failure | 3 |
CHATCLI_SCHEDULER_DEFAULT_TTL | TTL for terminal records on disk | 24h |
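The default retry budget can be pictured with the small helper below, which prints the delay before each retry for a given initial backoff, cap, multiplier, and retry count. It is a sketch of plain capped exponential backoff with whole-second values; it ignores jitter, and the scheduler's exact formula may differ. The function name `backoff_schedule` is hypothetical.

```shell
# Usage: backoff_schedule INITIAL_SECONDS CAP_SECONDS MULTIPLIER RETRIES
backoff_schedule() {
  delay=$1; cap=$2; mult=$3; retries=$4
  attempt=1
  while [ "$attempt" -le "$retries" ]; do
    echo "$delay"
    delay=$((delay * mult))
    if [ "$delay" -gt "$cap" ]; then delay=$cap; fi
    attempt=$((attempt + 1))
  done
}

# Defaults from the table: 1s initial, 5m (300s) cap, factor 2, 3 retries.
backoff_schedule 1 300 2 3
```

With the defaults, the cap never comes into play; it only matters for jobs that override the retry count or the initial backoff.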
Safety
| Variable | Description | Default |
|---|
CHATCLI_SCHEDULER_RATE_LIMIT_GLOBAL_RPS | Global token bucket (req/s) | 5.0 |
CHATCLI_SCHEDULER_RATE_LIMIT_GLOBAL_BURST | Global bucket burst | 20 |
CHATCLI_SCHEDULER_RATE_LIMIT_OWNER_RPS | Per-owner token bucket | 1.0 |
CHATCLI_SCHEDULER_RATE_LIMIT_OWNER_BURST | Per-owner bucket burst | 10 |
CHATCLI_SCHEDULER_BREAKER_FAILURE_THRESHOLD | Consecutive failures before breaker opens | 5 |
CHATCLI_SCHEDULER_BREAKER_WINDOW | Breaker count window | 60s |
CHATCLI_SCHEDULER_BREAKER_COOLDOWN | Cooldown before half-open probe | 30s |
CHATCLI_SCHEDULER_SHELL_ALLOW_BYPASS | When true, jobs may set bypass_safety=true in the action payload to skip both the preflight and the fire-time re-check. Without this variable, bypass_safety=true is rejected. Shell actions normally pass through CoderMode preflight (ClassifyShellCommand) on enqueue and re-check in RunShell at fire — this env is the escape hatch for trusted CI. | false |
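The RPS and burst pairs above describe token buckets. The sketch below shows the general technique with the global defaults (5.0 req/s, burst 20); the class is illustrative only, and details such as starting with a full bucket are assumptions rather than confirmed scheduler behavior.

```python
class TokenBucket:
    """Token bucket: refills at `rps` tokens per second, holds at most `burst`."""

    def __init__(self, rps: float, burst: int, now: float = 0.0):
        self.rps = rps
        self.burst = float(burst)
        self.tokens = float(burst)  # assumption: the bucket starts full
        self.last = now

    def allow(self, now: float) -> bool:
        """Consume one token if available; `now` is a timestamp in seconds."""
        # Refill for the elapsed time, never above the burst size.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rps)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A bucket built as `TokenBucket(5.0, 20)` admits 20 requests back to back, then roughly 5 more for each second that passes.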
Audit log
| Variable | Description | Default |
|---|
CHATCLI_SCHEDULER_AUDIT_ENABLED | Write JSONL to <data_dir>/audit.log | true |
CHATCLI_SCHEDULER_AUDIT_MAX_SIZE_MB | Size rotation (lumberjack) | 10 |
CHATCLI_SCHEDULER_AUDIT_MAX_BACKUPS | Backups kept | 7 |
CHATCLI_SCHEDULER_AUDIT_MAX_AGE_DAYS | Max age in days | 30 |
Daemon
| Variable | Description | Default |
|---|
CHATCLI_SCHEDULER_DAEMON_SOCKET | UNIX socket path | /tmp/chatcli-scheduler.sock |
CHATCLI_SCHEDULER_DAEMON_AUTO_CONNECT | CLI auto-detects daemon and becomes thin client | true |
CHATCLI_SCHEDULER_SNAPSHOT_INTERVAL | Interval between snapshots | 5m |
CHATCLI_SCHEDULER_WAL_GC_INTERVAL | Interval between GC passes over expired terminal records | 1h |
Metrics and Observability
| Variable | Description | Default |
|---|
CHATCLI_METRICS_PORT | HTTP port to export Prometheus metrics (0 = disabled) | 9090 |
PROMETHEUS_URL | Prometheus URL for metrics collection during AIOps incident analysis. When set, the operator queries CPU/memory/latency/error rate trends correlated with incidents. Example: http://prometheus-server.monitoring.svc:9090 | (empty = disabled) |
Security
| Variable | Description | Default |
|---|
CHATCLI_SAFETY_ENABLED | Enable configurable safety rules | false |
CHATCLI_GRPC_REFLECTION | Enable gRPC reflection on server (use only in dev) | false |
CHATCLI_DISABLE_VERSION_CHECK | Disable automatic version check | false |
CHATCLI_LATEST_VERSION_URL | Custom URL for version check | GitHub API |
Context and memory threat scan
Sanitizes injected prompt content — contexts attached via /context attach and long-term memory — against prompt-injection before sending it to the model.
| Variable | Description | Default |
|---|
CHATCLI_THREATSCAN | Toggle the context/memory threat scan. On by default; accepts false/0/off/no to disable. | true (on) |
Server Security
| Variable | Description | Default |
|---|
CHATCLI_JWT_SECRET | JWT signing secret (HS256) or path to RSA key (RS256) | "" |
CHATCLI_JWT_ISSUER | Expected JWT issuer claim | "" |
CHATCLI_JWT_AUDIENCE | Expected JWT audience claim | "" |
CHATCLI_RATE_LIMIT_RPS | Rate limit: sustained requests per second | 10 |
CHATCLI_RATE_LIMIT_BURST | Rate limit: burst capacity | 20 |
CHATCLI_MAX_RECV_MSG_SIZE | Maximum gRPC receive message size (bytes) | 4194304 |
CHATCLI_MAX_SEND_MSG_SIZE | Maximum gRPC send message size (bytes) | 4194304 |
CHATCLI_MAX_CONCURRENT_STREAMS | Maximum concurrent gRPC streams per connection | 100 |
CHATCLI_BIND_ADDRESS | Network interface to bind to. Defaults to 127.0.0.1 (local); auto-detects 0.0.0.0 in Kubernetes via KUBERNETES_SERVICE_HOST. | 127.0.0.1 / 0.0.0.0 (K8s) |
CHATCLI_AUDIT_LOG_PATH | Path for the structured audit log file. Must be absolute — relative values are rejected with a logged error. | "" |
CHATCLI_LOG_FILE | Application log file path | ~/.chatcli/app.log |
CHATCLI_LOG_MAX_SIZE_MB | Max log file size before rotation (MB) | 100 |
CHATCLI_LOG_MAX_BACKUPS | Number of rotated log files to keep | 3 |
CHATCLI_LOG_MAX_AGE_DAYS | Maximum age of rotated log files (days) | 30 |
CHATCLI_DEBUG | Enable debug mode with verbose logging | false |
CHATCLI_ALLOW_HTTP_PROVIDERS | Allow HTTP (non-TLS) connections to LLM providers | false |
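The CHATCLI_BIND_ADDRESS default depends on where the server runs. A minimal sketch of that resolution, assuming an explicit value always wins over auto-detection (the function name is hypothetical):

```python
import os


def resolve_bind_address(env=None) -> str:
    """Pick the bind address from an environment mapping."""
    env = os.environ if env is None else env
    # Assumption: an explicit CHATCLI_BIND_ADDRESS overrides auto-detection.
    explicit = env.get("CHATCLI_BIND_ADDRESS", "").strip()
    if explicit:
        return explicit
    # Kubernetes injects KUBERNETES_SERVICE_HOST into every pod.
    if env.get("KUBERNETES_SERVICE_HOST"):
        return "0.0.0.0"
    return "127.0.0.1"
```

Passing a plain dict instead of the real environment makes the rule easy to check, for example `resolve_bind_address({"KUBERNETES_SERVICE_HOST": "10.0.0.1"})`.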
Agent Security
| Variable | Description | Default |
|---|
CHATCLI_AGENT_SECURITY_MODE | Security mode: strict (allowlist) or permissive (denylist) | permissive |
CHATCLI_AGENT_ALLOWLIST | Extra allowed commands in strict mode (semicolon-separated) | "" |
CHATCLI_AGENT_WORKSPACE_STRICT | Restrict file access to workspace directory only | false |
CHATCLI_AGENT_ALLOW_KUBECONFIG | Allow agent commands to access kubeconfig | false |
CHATCLI_AGENT_EXTRA_READ_PATHS | Additional allowed read paths (semicolon-separated) | "" |
CHATCLI_AGENT_SOURCE_SHELL_CONFIG | Source shell config (~/.bashrc, etc.) in agent commands | false |
CHATCLI_MAX_COMMAND_OUTPUT | Maximum command output size (bytes) | 65536 |
Plugin and Auth Security
| Variable | Description | Default |
|---|
CHATCLI_ALLOW_UNSIGNED_PLUGINS | Allow unsigned plugins to execute | false |
CHATCLI_ALLOW_INSECURE | Allow insecure (non-TLS) connections | false |
CHATCLI_TLS_CLIENT_CERT | Client TLS certificate for mTLS | "" |
CHATCLI_TLS_CLIENT_KEY | Client TLS key for mTLS | "" |
CHATCLI_ENCRYPTION_KEY | Custom encryption key for session data | Auto-generated |
CHATCLI_KEYCHAIN_BACKEND | Keychain backend: auto, file, keychain | auto |
CHATCLI_DISABLE_HISTORY | Disable conversation history recording | false |
CHATCLI_SESSION_TTL | Session time-to-live before expiration | 24h |
CHATCLI_ENV_REDACT_MODE | Env redaction: full, partial, none | full |
CHATCLI_REDACT_PATTERNS | Custom redaction patterns (semicolon-separated regex) | "" |
Operator Security
The operator REST API authentication loads API keys with hot-reload every 30s in the following priority order:
Secret chatcli-operator-secrets → ConfigMap chatcli-operator-config → reject (or accept in dev-mode if CHATCLI_OPERATOR_DEV_MODE=true).
| Variable | Description | Default |
|---|
CHATCLI_OPERATOR_DEV_MODE | Enable operator dev mode (relaxed security) | false |
CHATCLI_AIOPS_TLS_CERT | AIOps REST API TLS certificate | "" |
CHATCLI_AIOPS_TLS_KEY | AIOps REST API TLS key | "" |
CHATCLI_GRPC_TLS_CERT | gRPC TLS certificate | "" |
CHATCLI_GRPC_TLS_KEY | gRPC TLS key | "" |
CHATCLI_GRPC_TLS_CA | gRPC CA certificate for mTLS verification. Must be an absolute path (relative values return an error). In the operator this is only a fallback — when using the Instance CR, WatcherBridge reads ca.crt from the Secret referenced by spec.server.tls.secretName automatically (see cookbook §2.1). | "" |
CHATCLI_ALLOWED_RESOURCE_TYPES | Kubernetes resource types allowed by ApplyManifest (comma-separated) | Deployment,StatefulSet,DaemonSet,… |
CHATCLI_ALLOWED_DIAGNOSTIC_COMMANDS | Additional commands accepted by ExecDiagnostic (comma-separated, extends the ~90-entry default allowlist — see k8s-operator#execdiagnostic-allowlist) | "" (defaults only) |
CHATCLI_OPERATOR_APP_VERSION | Injected by the Helm chart from .Chart.AppVersion. The operator reads it to resolve the server image tag when Instance.spec.image.tag is omitted — letting helm upgrade of the operator also roll Instances. Do not set manually. | Set by chart |
CHATCLI_LOG_SCRUB_PATTERNS | Patterns to scrub from logs (semicolon-separated regex) | Built-in patterns |
OAuth
| Variable | Description | Default |
|---|
CHATCLI_OPENAI_CLIENT_ID | Override OpenAI OAuth client ID | — |
Remote Server
| Variable | Description | Default |
|---|
CHATCLI_SERVER_PORT | gRPC server port | 50051 |
CHATCLI_SERVER_TOKEN | Authentication token | — |
CHATCLI_SERVER_TLS_CERT | TLS certificate path | — |
CHATCLI_SERVER_TLS_KEY | TLS key path | — |
Remote Client
| Variable | Description | Default |
|---|
CHATCLI_REMOTE_ADDR | Remote server address | — |
CHATCLI_REMOTE_TOKEN | Authentication token | — |
CHATCLI_CLIENT_API_KEY | Your API key (sent to server) | — |
Chat Gateway (Telegram / Slack / Discord / WhatsApp / Webhook)
Bridge that exposes ChatCLI as a bot/service on messaging platforms. Started with /gateway start. Each inbound message runs through the real agent loop (tools, shell, file edits — auto-executed) and the progress is streamed back to the chat. Each adapter only activates when its required variables are present — set only the channels you want. See Chat Gateway.
Telegram
| Variable | Description | Default |
|---|
CHATCLI_TELEGRAM_BOT_TOKEN | Bot token (BotFather). Required to enable the Telegram adapter (long-polling via getUpdates). | — |
CHATCLI_TELEGRAM_ALLOWED_USERS | Allowed user IDs, separated by comma/space/;. Empty = any user. | — (all) |
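The allowlist accepts commas, spaces, or semicolons as separators, and an empty value admits everyone. The following sketch shows one way such a value could be parsed and checked; both function names are made up for illustration.

```python
import re


def parse_allowed_users(raw: str) -> set:
    """Split a CHATCLI_TELEGRAM_ALLOWED_USERS value into a set of user IDs."""
    # Any run of commas, semicolons, or whitespace separates two IDs.
    return {part for part in re.split(r"[,;\s]+", raw.strip()) if part}


def is_allowed(user_id: str, allowed: set) -> bool:
    """An empty allowlist means any user may talk to the bot."""
    return not allowed or user_id in allowed
```

Since an empty allowlist lets any Telegram user drive the agent loop, setting it explicitly is the safer choice for a bot that can run tools.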
Slack
| Variable | Description | Default |
|---|
CHATCLI_SLACK_BOT_TOKEN | Bot token (xoxb-…) used to reply. Required. | — |
CHATCLI_SLACK_ADDR | Bind address of the events HTTP server (e.g. :8081). Required — the adapter only starts when set. | — |
CHATCLI_SLACK_SIGNING_SECRET | Signing secret to verify the HMAC signature of events (Events API). | — |
CHATCLI_SLACK_PATH | Path of the events endpoint. | /slack/events |
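For context on what the signing secret protects, the sketch below follows Slack's published v0 request-signing scheme: HMAC-SHA256 over `v0:<timestamp>:<body>`, compared against the `X-Slack-Signature` header. It illustrates the scheme itself; the function name and the 5-minute replay window parameter are assumptions, and this is not ChatCLI's adapter code.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_slack_signature(signing_secret: str, timestamp: str, body: bytes,
                           signature: str, now: Optional[float] = None,
                           max_skew: int = 300) -> bool:
    """Check X-Slack-Signature for a raw request body."""
    now = time.time() if now is None else now
    try:
        ts = int(timestamp)  # value of X-Slack-Request-Timestamp
    except ValueError:
        return False
    # Reject stale requests to limit replay attacks.
    if abs(now - ts) > max_skew:
        return False
    base = b"v0:" + timestamp.encode() + b":" + body
    digest = hmac.new(signing_secret.encode(), base, hashlib.sha256).hexdigest()
    expected = "v0=" + digest
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

The signature must be computed over the raw request bytes, before any JSON parsing or re-serialization.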
Discord
| Variable | Description | Default |
|---|
CHATCLI_DISCORD_BOT_TOKEN | Bot token. Required to enable the Discord adapter (Gateway WebSocket v10). | — |
WhatsApp (Cloud API)
| Variable | Description | Default |
|---|
CHATCLI_WHATSAPP_ACCESS_TOKEN | WhatsApp Cloud API access token. Required. | — |
CHATCLI_WHATSAPP_PHONE_ID | Phone number ID used to send replies. Required. | — |
CHATCLI_WHATSAPP_ADDR | Bind address of the webhook server (e.g. :8082). Required. | — |
CHATCLI_WHATSAPP_VERIFY_TOKEN | Webhook verification token (Meta’s GET handshake). | — |
CHATCLI_WHATSAPP_PATH | Webhook path. | /whatsapp/webhook |
Generic webhook
| Variable | Description | Default |
|---|
CHATCLI_WEBHOOK_ADDR | Bind address of the generic webhook server (e.g. :8083). Required — the adapter only starts when set. | — |
CHATCLI_WEBHOOK_PATH | Path that receives inbound POSTs. | /inbound |
CHATCLI_WEBHOOK_SECRET | Shared secret, validated in constant time. Empty = no verification. | — |
CHATCLI_WEBHOOK_CALLBACK_URL | URL the reply is POSTed back to. Empty = synchronous reply on the request. | — |
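A constant-time check of the shared secret might look like the sketch below. How the caller presents the secret (header, query parameter, or body field) is not specified here, so the function only compares two strings; its name is hypothetical.

```python
import hmac


def webhook_secret_ok(configured: str, presented: str) -> bool:
    """Compare the configured secret with the one sent by the caller."""
    # Empty CHATCLI_WEBHOOK_SECRET means verification is turned off.
    if not configured:
        return True
    # compare_digest takes time independent of where the inputs differ.
    return hmac.compare_digest(configured.encode(), presented.encode())
```

Because an empty secret accepts every request, leaving it unset is only reasonable when the bind address is not reachable from untrusted networks.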
Voice / transcription (audio on channels)
Transcribes voice notes to text before the pipeline. Local-first selection: command → self-hosted URL → whisper CLI on PATH (auto, downloads the model) → Groq → OpenAI → disabled. See Chat Gateway → Voice messages.
| Variable | Description | Default |
|---|
CHATCLI_TRANSCRIPTION_PROVIDER | Pin the STT backend: embedded, command, url, groq, openai (or auto). In auto, with nothing configured, it falls back to the embedded Whisper (one-time download at daemon startup). | auto |
CHATCLI_TRANSCRIPTION_CMD | Local STT command (placeholders {input}/{output_dir}/{lang}); reads the transcript from stdout or the generated .txt. | — |
CHATCLI_TRANSCRIPTION_URL | Self-hosted OpenAI-compatible endpoint (/v1). Keyless. | — |
CHATCLI_TRANSCRIPTION_KEY | Optional key for the self-hosted endpoint. | — |
CHATCLI_TRANSCRIPTION_MODEL | Model (embedded: tiny/base/small/medium/large-v3; cloud: e.g. whisper-1; local whisper: base/small). | whisper-1 (cloud) / base (embedded) |
CHATCLI_TRANSCRIPTION_LANG | Forced language; empty = auto-detect the spoken language. | (auto) |
CHATCLI_TRANSCRIPTION_CACHE_DIR | Relocates the embedded engine cache (absolute path; default ~/.cache/chatcli/stt/). | (os cache dir) |
CHATCLI_GATEWAY_MAX_AUDIO_BYTES | Max downloaded audio size. | 20971520 (20MB) |
Voice notes are OGG/Opus; local whisper.cpp and the embedded engine need ffmpeg to decode them (cloud/self-hosted backends decode server-side). With nothing configured, the gateway uses the embedded Whisper automatically — only platforms without a prebuilt engine get the configuration hint.
Proactive messaging (@send)
Default (home) channel for each platform, used by the @send tool when the target names only the platform.
| Variable | Description | Default |
|---|
CHATCLI_TELEGRAM_HOME_CHANNEL | Default Telegram channel. | — |
CHATCLI_WHATSAPP_HOME_CHANNEL | Default WhatsApp channel. | — |
CHATCLI_DISCORD_HOME_CHANNEL | Default Discord channel. | — |
CHATCLI_SLACK_HOME_CHANNEL | Default Slack channel. | — |
CHATCLI_WEBHOOK_HOME_CHANNEL | Default webhook target. | — |
Voice reply / TTS
Synthesizes the reply to audio. Local-first: command → self-hosted URL → embedded (if provisioned) → say/espeak on PATH → OpenAI → Groq → Gemini → disabled. See Text-to-Speech and Voice Replies.
| Variable | Description | Default |
|---|
CHATCLI_GATEWAY_VOICE_REPLY | Gateway voice reply mode: auto (voice answers voice), always, never. Legacy booleans work (true→always, false→never). | auto |
CHATCLI_TTS_PROVIDER | Pin the backend: embedded (or kokoro)/command/url/openai/groq/google (or auto). | auto |
CHATCLI_TTS_CMD | Local TTS command (placeholders {text}/{output}). Keyless. | — |
CHATCLI_TTS_CMD_EXT | Output extension for CHATCLI_TTS_CMD. | wav |
CHATCLI_TTS_URL | OpenAI-compatible endpoint (/audio/speech). | — |
CHATCLI_TTS_KEY | Optional key for the self-hosted endpoint. | — |
CHATCLI_TTS_MODEL | TTS model. | tts-1 |
CHATCLI_TTS_VOICE | Voice (backend-dependent). Embedded: voice for English. | alloy (embedded: bm_george) |
CHATCLI_TTS_VOICE_PT | Embedded-engine voice for Portuguese replies. | pm_alex |
CHATCLI_TTS_CACHE_DIR | Relocates the embedded engine/model cache. Absolute path. | OS cache dir |
CHATCLI_TTS_VOICE_FORMAT | Format requested for the gateway reply (ogg = Telegram voice note). | ogg |
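The three CHATCLI_GATEWAY_VOICE_REPLY modes and their legacy boolean aliases can be summarized in a short sketch. Both helpers are hypothetical, and treating unknown values as `auto` is an assumption.

```python
def normalize_voice_reply(raw: str) -> str:
    """Map a CHATCLI_GATEWAY_VOICE_REPLY value to auto, always, or never."""
    value = raw.strip().lower()
    if value in ("auto", "always", "never"):
        return value
    # Legacy booleans: true means always, false means never.
    if value == "true":
        return "always"
    if value == "false":
        return "never"
    return "auto"  # assumption: unknown or empty values use the default


def should_reply_with_voice(mode: str, inbound_was_voice: bool) -> bool:
    """In auto mode, voice answers voice; otherwise the mode decides."""
    if mode == "always":
        return True
    if mode == "never":
        return False
    return inbound_was_voice
```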
Image generation (@image)
Generates images from text. Local-first: SD WebUI → OpenAI-compatible URL → OpenAI → Google Imagen → xAI grok-image → Bedrock. See Image Generation.
| Variable | Description | Default |
|---|
CHATCLI_IMAGE_PROVIDER | Pin the backend: sdwebui/url/openai/responses/google/xai/bedrock (or auto). | auto |
CHATCLI_IMAGE_API | For OpenAI: images (gpt-image-*) or responses (a chat model like gpt-5.5 generates via the image_generation tool). | images |
CHATCLI_IMAGE_URL | SD WebUI (http://localhost:7860) or an OpenAI-compatible endpoint. | — |
CHATCLI_IMAGE_KEY | Optional key for the self-hosted endpoint. | — |
CHATCLI_IMAGE_MODEL | Model (e.g. gpt-image-1, gpt-5.5, grok-2-image, imagen-3.0-generate-002, amazon.nova-canvas-v1:0). | gpt-image-1 |
CHATCLI_IMAGE_STEPS | Sampling steps (SD WebUI). | 25 |
xAI (image) and Groq (voice) use their standard keys (XAI_API_KEY, GROQ_API_KEY); Google uses GOOGLEAI_API_KEY/GEMINI_API_KEY; Bedrock uses the chat provider’s AWS credential chain (BEDROCK_REGION/AWS_REGION, BEDROCK_PROFILE/AWS_PROFILE). The @osv, @session and @skill tools need no variables. Use @image models or /config image to view/switch backend and model at runtime.
Conversation Hub (cross-channel continuity)
Carries a conversation across channels: a topic started on Telegram/Slack/WhatsApp continues in chatcli on your laptop (and vice versa) until /newsession. It is a short-lived bridge backed by a bounded database, not long-term memory (that lives in /memory and /session). See Conversation Hub.
Every option below can also be changed at runtime by command (/config hub set <key> <value>), with precedence: setting (in hub.db) > environment variable > default. A value set by command persists in the database and is read live by the gateway daemon.
| Variable | Description | Default |
|---|
CHATCLI_HUB_PRINCIPAL | Shared conversation identity (single-user mode). Unset, it falls back to default — the local CLI and the gateway’s unbound senders collapse to it, so it works with zero configuration. | default |
CHATCLI_HUB_ENABLED | Enable the hub. false disables it (local CLI “starts fresh”; gateway has no continuity). | true |
CHATCLI_HUB_ISOLATE | Multi-user/public bot: true keeps each channel identity in its own conversation (one user never sees another’s thread). Off in the single-user default. | false |
CHATCLI_HUB_TTL_HOURS | Hours before an idle conversation is pruned (PurgeIdle runs when the CLI/gateway opens). 0 disables purging. | 24 |
CHATCLI_HUB_BINDINGS | Explicit platform:user_id=principal bindings, separated by ;/, (e.g. telegram:123=alice;slack:U1=alice). Take precedence over the single-user collapse. | — |
CHATCLI_HUB_DB | Path to the hub’s SQLite (WAL) database. | ~/.chatcli/hub.db |
CHATCLI_HUB_TAIL_BUFFER | Per-subscriber fan-out (live tail) buffer size on the gRPC server. | 256 |
CHATCLI_GATEWAY_IN_SERVER | true runs the gateway inside the server process (chatcli server), sharing one in-memory broker — enables real-time push to a connected CLI (cross-process only syncs on connect / next turn). | false |
To run chatcli and the gateway on one machine with shared context, set only CHATCLI_HUB_PRINCIPAL (or keep the default); no bindings are needed. For a multi-user bot, set CHATCLI_HUB_ISOLATE=true and use CHATCLI_HUB_BINDINGS to map each channel identity to a principal.
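The interplay of bindings, isolation, and the default principal can be sketched as follows. The parsing mirrors the `platform:user_id=principal` format described in the table; the function names are hypothetical, and using `platform:user_id` as the isolated identity is an assumption, since the table only says each channel identity gets its own conversation.

```python
import re


def parse_hub_bindings(raw: str) -> dict:
    """Parse CHATCLI_HUB_BINDINGS into {"platform:user_id": principal}."""
    bindings = {}
    for entry in re.split(r"[;,]", raw):
        entry = entry.strip()
        if not entry or "=" not in entry:
            continue  # assumption: malformed entries are skipped
        identity, principal = entry.split("=", 1)
        bindings[identity.strip()] = principal.strip()
    return bindings


def resolve_principal(platform: str, user_id: str, bindings: dict,
                      default_principal: str = "default",
                      isolate: bool = False) -> str:
    """Decide which conversation a sender belongs to."""
    identity = f"{platform}:{user_id}"
    # Explicit bindings take precedence over the single-user collapse.
    if identity in bindings:
        return bindings[identity]
    # Assumption: with isolation on, an unbound sender keeps its own identity.
    if isolate:
        return identity
    return default_principal
```

With `telegram:123=alice;slack:U1=alice`, both identities resolve to `alice` and therefore share one conversation, while an unbound sender falls back to the default principal unless isolation is on.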
LSP (Language Server Protocol — diagnostics)
Code diagnostics (compiler/linter errors and warnings) for a file, via LSP servers. Triggered manually by the /lsp <file> command. Each variable overrides the command used to start the language server for its language; when unset, ChatCLI uses the default preset (if the binary is on PATH). See LSP Diagnostics.
| Variable | Description | Default (preset) |
|---|
CHATCLI_LSP_GO_CMD | Go language server command. | gopls |
CHATCLI_LSP_PYTHON_CMD | Python language server command. | pyright-langserver --stdio |
CHATCLI_LSP_TS_CMD | TypeScript/JavaScript language server command. | typescript-language-server --stdio |
CHATCLI_LSP_RUST_CMD | Rust language server command. | rust-analyzer |
CHATCLI_LSP_C_CMD | C language server command. | clangd |
CHATCLI_LSP_CPP_CMD | C++ language server command. | clangd |
CHATCLI_LSP_JAVA_CMD | Java language server command. | jdtls |
CHATCLI_LSP_RUBY_CMD | Ruby language server command. | solargraph stdio |
Web Search
| Variable | Description | Default |
|---|
CHATCLI_WEBSEARCH_PROVIDER | Preferred backend for @websearch / /websearch: searxng, duckduckgo, brave, mojeek, or auto. | auto |
SEARXNG_URL | Root URL of the self-hosted SearxNG instance (e.g. https://searx.internal.corp). When set, SearxNG joins the fallback chain. | — |
Backends are keyless by design: DuckDuckGo (HTML scraping, default) + SearxNG (self-hosted via SEARXNG_URL). See Web Tools for the fallback chain.
K8s Watcher
| Variable | Description | Default |
|---|
CHATCLI_WATCH_DEPLOYMENT | Single deployment (legacy) | — |
CHATCLI_WATCH_NAMESPACE | Deployment namespace | default |
CHATCLI_WATCH_INTERVAL | Collection interval | 30s |
CHATCLI_WATCH_WINDOW | Observation window | 2h |
CHATCLI_WATCH_MAX_LOG_LINES | Maximum log lines per pod | 100 |
CHATCLI_WATCH_CONFIG | Path to multi-target YAML config | — |
CHATCLI_KUBECONFIG | Kubeconfig path | Auto-detected |
Complete .env Example
# General
LOG_LEVEL=info
CHATCLI_LANG=pt-BR
CHATCLI_ENV=prod
LLM_PROVIDER=CLAUDEAI
# Primary provider
ANTHROPIC_API_KEY=sk-ant-xxxxxxxxxxxxxxxxxxxxxxxx
ANTHROPIC_MODEL=claude-sonnet-4-6
ANTHROPIC_MAX_TOKENS=20000
# Fallback
CHATCLI_FALLBACK_PROVIDERS=CLAUDEAI,OPENAI,GOOGLEAI,ZAI,MINIMAX,MOONSHOT,OPENROUTER
OPENAI_API_KEY=sk-xxxxxxxxxxxxxxxxxxxxxxxx
GOOGLEAI_API_KEY=AIzaxxxxxxxxxxxxxxxxxxxxxxxx
ZAI_API_KEY=your-zai-api-key
MINIMAX_API_KEY=your-minimax-api-key
MOONSHOT_API_KEY=sk-your-moonshot-api-key
# MOONSHOT_MODEL=kimi-k2.6
# MOONSHOT_THINKING=auto # auto | enabled | disabled
OPENROUTER_API_KEY=sk-or-xxxxxxxxxxxxxxxxxxxxxxxx
# MINIMAX_API_COMPAT=anthropic # use Anthropic-compatible endpoint
# Agent
CHATCLI_AGENT_CMD_TIMEOUT=2m
CHATCLI_AGENT_ALLOW_SUDO=false
# Multi-Agent
CHATCLI_AGENT_PARALLEL_MODE=true
CHATCLI_AGENT_MAX_WORKERS=4
# Bootstrap and Memory
CHATCLI_BOOTSTRAP_ENABLED=true
CHATCLI_MEMORY_ENABLED=true
# Corporate proxy — uncomment and tune if your environment caps body size
# CHATCLI_MAX_PAYLOAD=5MB
# CHATCLI_BLOCK_TMP_WRITES=true # strict sandbox, blocks /tmp in allowlist
# Corporate TLS inspection (private CA) — global, applies to all providers and tools
# CHATCLI_CA_BUNDLE=/etc/ssl/corp-ca-bundle.pem
# CHATCLI_TLS_INSECURE_SKIP_VERIFY=true # INSECURE — diagnosis only, never production
# Context window — uncomment if your gateway/agent's real window differs from the catalog
# CHATCLI_CONTEXT_WINDOW=128000
# Web search — uncomment to prefer self-hosted SearxNG
# SEARXNG_URL=https://searx.internal.corp
# CHATCLI_WEBSEARCH_PROVIDER=searxng # optional, default "auto" (DDG first)
# Mixture-of-Agents (/moa) — uncomment to enable the ensemble
# CHATCLI_MOA_MODELS=openai:gpt-5,claudeai:claude-opus-4-8,googleai:gemini-2.5-pro
# CHATCLI_MOA_AGGREGATOR=claudeai:claude-opus-4-8
# Chat Gateway (/gateway start) — uncomment only the channels you use
# CHATCLI_TELEGRAM_BOT_TOKEN=123:abc
# CHATCLI_TELEGRAM_ALLOWED_USERS=111111111,222222222
# CHATCLI_SLACK_BOT_TOKEN=xoxb-xxx
# CHATCLI_SLACK_ADDR=:8081
# CHATCLI_SLACK_SIGNING_SECRET=xxxx
# CHATCLI_DISCORD_BOT_TOKEN=xxxx
# CHATCLI_WHATSAPP_ACCESS_TOKEN=xxxx
# CHATCLI_WHATSAPP_PHONE_ID=123456
# CHATCLI_WHATSAPP_ADDR=:8082
# CHATCLI_WHATSAPP_VERIFY_TOKEN=my-verify
# CHATCLI_WEBHOOK_ADDR=:8083
# CHATCLI_WEBHOOK_SECRET=supersecret
# CHATCLI_WEBHOOK_CALLBACK_URL=https://myapp.example/callback
# Conversation Hub (cross-channel continuity) — on by default
# CHATCLI_HUB_PRINCIPAL=edilson # shared identity (default: "default")
# CHATCLI_HUB_ISOLATE=true # multi-user bot: isolate each channel
# CHATCLI_HUB_TTL_HOURS=24 # prune idle conversations (0 disables)
# CHATCLI_HUB_BINDINGS=telegram:123=edilson;slack:U1=edilson
# CHATCLI_GATEWAY_IN_SERVER=true # gateway in the server process (real-time push)
# CHATCLI_HUB_ENABLED=false # disable the hub entirely
# LSP (/lsp) — uncomment only to override the default command
# CHATCLI_LSP_GO_CMD=gopls -rpc.trace
# CHATCLI_LSP_PYTHON_CMD=pyright-langserver --stdio
# Context/memory security and tool guard — on by default
# CHATCLI_THREATSCAN=false # disable prompt-injection scan of context/memory
# CHATCLI_AGENT_TOOLGUARD=false # disable the repeated tool-failure hint