Package Overview
Main Struct: ChatCLI
The ChatCLI struct in cli/cli.go is the heart of the system. Its most important fields are organized by responsibility:
LLM and Provider
| Field | Type | Description |
|---|---|---|
| Client | client.LLMClient | Active LLM provider client |
| manager | manager.LLMManager | Provider manager (creates clients, lists available) |
| Provider | string | Active provider name (e.g., "ANTHROPIC", "OPENAI") |
| Model | string | Active model (e.g., "claude-sonnet-4-20250514") |
History and Compaction
| Field | Type | Description |
|---|---|---|
| history | []models.Message | Unified conversation history (shared across all modes) |
| historyCompactor | *HistoryCompactor | 3-level compaction pipeline |
| historyManager | *HistoryManager | Command history persistence to disk |
Execution Control
| Field | Type | Description |
|---|---|---|
| isExecuting | atomic.Bool | Atomic flag: true while waiting for LLM response |
| operationCancel | context.CancelFunc | Cancels the in-flight LLM operation (Ctrl+C) |
| processingDone | chan struct{} | Signals processing completion to the main loop |
| interactionState | InteractionState | Interaction state: Normal, SwitchingProvider, Processing, AgentMode |
| executionProfile | ExecutionProfile | Execution profile: Normal, Agent, Coder |
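To illustrate how an atomic flag and a context.CancelFunc can cooperate for Ctrl+C handling, here is a minimal sketch. The executor type and the run, interrupt, and waitForCancel helpers are hypothetical names chosen for this example, not ChatCLI's actual methods; only the two field names come from the table above.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

// executor is a hypothetical stand-in for the execution-control fields above.
type executor struct {
	isExecuting atomic.Bool // true while an operation is in flight

	mu              sync.Mutex
	operationCancel context.CancelFunc // cancels the in-flight operation
}

// run executes op under a cancellable context and keeps the flag accurate.
func (e *executor) run(op func(ctx context.Context) error) error {
	ctx, cancel := context.WithCancel(context.Background())
	e.mu.Lock()
	e.operationCancel = cancel
	e.mu.Unlock()

	e.isExecuting.Store(true)
	defer func() {
		e.isExecuting.Store(false)
		cancel() // always release the context's resources
	}()
	return op(ctx)
}

// interrupt is what a Ctrl+C handler might call. It reports whether an
// operation was in flight and has been asked to stop.
func (e *executor) interrupt() bool {
	if !e.isExecuting.Load() {
		return false
	}
	e.mu.Lock()
	cancel := e.operationCancel
	e.mu.Unlock()
	if cancel != nil {
		cancel()
	}
	return true
}

// waitForCancel simulates a slow LLM call: it signals that it has started,
// then blocks until its context is cancelled.
func waitForCancel(started chan<- struct{}) func(context.Context) error {
	return func(ctx context.Context) error {
		close(started)
		<-ctx.Done()
		return ctx.Err()
	}
}

func main() {
	e := &executor{}
	started := make(chan struct{})
	done := make(chan error, 1)
	go func() { done <- e.run(waitForCancel(started)) }()

	<-started // the simulated call is now in flight
	fmt.Println("interrupted:", e.interrupt())
	fmt.Println("cancelled:", errors.Is(<-done, context.Canceled))
}
```

Checking the flag before cancelling lets the same key press mean "cancel" during a request and something else when idle.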
Subsystems
| Field | Type | Description |
|---|---|---|
| agentMode | *AgentMode | Agent mode instance (ReAct loop) |
| pluginManager | *plugins.Manager | Plugin manager (built-in + external) |
| commandHandler | *CommandHandler | Slash command router |
| contextHandler | *ContextHandler | Context manager (/context) |
| personaHandler | *PersonaHandler | Persona/agent manager |
| skillHandler | *SkillHandler | Skill registry and execution |
| contextBuilder | *workspace.ContextBuilder | Combines bootstrap + memory into system prompt |
| memoryStore | *workspace.MemoryStore | Facade for the structured memory system |
| memWorker | *memoryWorker | Background memory extraction worker |
Auxiliary State
| Field | Type | Description |
|---|---|---|
| checkpoints | []conversationCheckpoint | History restore points (max 20) |
| messageQueue | []string | FIFO queue of messages typed during processing (type-ahead) |
| lastEscTime | time.Time | Esc+Esc detection for rewind menu |
| sessionManager | *SessionManager | Save/load sessions to disk |
Startup Flow
The diagram below shows the complete boot sequence, from main.go to the interactive loop:
detectProjectDir()
Walks up from the current working directory to the filesystem root looking for project markers:
- .agent/ (explicit ChatCLI marker), which takes priority
- .git/ (common convention)

Returns "" if no marker is found.
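The upward walk can be sketched as follows. This is an illustration under stated assumptions, not the actual detectProjectDir: findProjectDir and makeDemoTree are hypothetical names, the start directory is passed in explicitly, and "priority" is read as "an .agent marker anywhere up the tree wins over a nearer .git".

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findProjectDir is a hypothetical variant of detectProjectDir that takes the
// starting directory as an argument so it can be exercised in isolation.
func findProjectDir(start string) string {
	// Markers in priority order: a full upward walk is done for .agent
	// before .git is considered at all (an assumed reading of "priority").
	for _, marker := range []string{".agent", ".git"} {
		for dir := start; ; dir = filepath.Dir(dir) {
			if info, err := os.Stat(filepath.Join(dir, marker)); err == nil && info.IsDir() {
				return dir
			}
			if filepath.Dir(dir) == dir { // reached the filesystem root
				break
			}
		}
	}
	return ""
}

// makeDemoTree builds root/.agent, root/sub/.git and root/sub/deep inside a
// temporary directory. It returns the root, the deepest directory, and a
// cleanup function.
func makeDemoTree() (root, deep string, cleanup func(), err error) {
	tmp, err := os.MkdirTemp("", "projdir")
	if err != nil {
		return "", "", nil, err
	}
	cleanup = func() { os.RemoveAll(tmp) }
	deep = filepath.Join(tmp, "sub", "deep")
	dirs := []string{filepath.Join(tmp, ".agent"), filepath.Join(tmp, "sub", ".git"), deep}
	for _, d := range dirs {
		if err := os.MkdirAll(d, 0o755); err != nil {
			cleanup()
			return "", "", nil, err
		}
	}
	return tmp, deep, cleanup, nil
}

func main() {
	root, deep, cleanup, err := makeDemoTree()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	defer cleanup()
	// The .agent marker at the root wins over the nearer .git in sub/.
	fmt.Println("found root:", findProjectDir(deep) == root)
}
```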
Message Flow: Chat Mode
In normal interactive mode, each user message follows this pipeline:
Type-ahead (messageQueue)
Messages typed while the LLM is processing are stored in a FIFO queue (messageQueue). After each response, the system drains the queue and processes each message sequentially, eliminating the need to re-type. Coder mode has its own queue with a real-time visual indicator (▼ N msg queue) when messages are pending.
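A minimal sketch of the drain step described above, assuming a plain slice as the queue. The typeAhead type and its enqueue and drain methods are hypothetical names; a real implementation would also need locking, since typing happens concurrently with processing.

```go
package main

import "fmt"

// typeAhead is a hypothetical holder for the FIFO queue.
type typeAhead struct {
	messageQueue []string
}

// enqueue records a message typed while a response is still pending.
func (t *typeAhead) enqueue(msg string) {
	t.messageQueue = append(t.messageQueue, msg)
}

// drain pops messages in arrival order until the queue is empty, including
// any that are enqueued while process is running. It returns how many
// messages were handled.
func (t *typeAhead) drain(process func(msg string)) int {
	handled := 0
	for len(t.messageQueue) > 0 {
		msg := t.messageQueue[0]
		t.messageQueue = t.messageQueue[1:]
		process(msg)
		handled++
	}
	return handled
}

func main() {
	q := &typeAhead{}
	q.enqueue("run the tests")
	q.enqueue("then summarize failures")
	n := q.drain(func(msg string) { fmt.Println("processing:", msg) })
	fmt.Println("drained", n, "messages")
}
```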
Input guard (anti-typeahead in security prompts)
This is distinct from the “good” type-ahead above. When a security box appears mid-turn, three layers discard queued input so that stray bytes are not consumed as the y/n answer:
- TTY flush: TCIFLUSH (Linux), TIOCFLUSH (BSD/Darwin), FlushConsoleInputBuffer (Windows).
- Drain channel: empties the non-blocking reader buffer.
- 250ms debounce: discards input arriving in the first 250ms after the box renders.
stty sane runs on the controlling /dev/tty to recover from a prior go-prompt teardown that may have left the terminal in raw mode.
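The drain and debounce layers can be sketched in portable Go; the TTY flush is platform-specific and omitted here. The drainPending and acceptAfter helpers and the debounceWindow constant are hypothetical names for this illustration.

```go
package main

import (
	"fmt"
	"time"
)

// debounceWindow mirrors the 250ms figure from the text.
const debounceWindow = 250 * time.Millisecond

// drainPending empties a buffered input channel without blocking and
// returns how many bytes were discarded.
func drainPending(input <-chan byte) int {
	discarded := 0
	for {
		select {
		case <-input:
			discarded++
		default: // nothing left to read
			return discarded
		}
	}
}

// acceptAfter reports whether a key press arriving `elapsed` after the
// security box rendered should be honored.
func acceptAfter(elapsed time.Duration) bool {
	return elapsed >= debounceWindow
}

func main() {
	input := make(chan byte, 8)
	for _, b := range []byte("y\n") {
		input <- b // keystrokes typed before the security box appeared
	}
	fmt.Println("discarded:", drainPending(input))
	fmt.Println("accept at 100ms:", acceptAfter(100*time.Millisecond))
	fmt.Println("accept at 300ms:", acceptAfter(300*time.Millisecond))
}
```

The select with a default branch is what keeps the drain from blocking once the buffer is empty.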
Message Flow: Agent/Coder Mode
Entering agent mode uses a panic/recover mechanism to exit the go-prompt loop:
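The general technique can be sketched without go-prompt itself: a sentinel value is passed to panic inside the blocking loop and recovered by the caller, which then switches mode. The modeSwitch type, the promptLoop stand-in, and the "/agent " command prefix are assumptions made for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// modeSwitch is a hypothetical sentinel carried by panic to leave the loop.
type modeSwitch struct{ task string }

// promptLoop stands in for a blocking prompt loop that offers no clean way
// to return early from inside its input handler.
func promptLoop(inputs []string) {
	for _, in := range inputs {
		if strings.HasPrefix(in, "/agent ") {
			panic(modeSwitch{task: strings.TrimPrefix(in, "/agent ")})
		}
	}
}

// runUntilModeSwitch runs the loop and converts the sentinel panic into
// ordinary return values. Any other panic is re-raised untouched.
func runUntilModeSwitch(inputs []string) (task string, switched bool) {
	defer func() {
		r := recover()
		if r == nil {
			return
		}
		sig, ok := r.(modeSwitch)
		if !ok {
			panic(r) // not our sentinel: propagate the real failure
		}
		task, switched = sig.task, true
	}()
	promptLoop(inputs)
	return "", false
}

func main() {
	task, switched := runUntilModeSwitch([]string{"hello", "/agent fix the failing tests"})
	fmt.Println(switched, task)
}
```

Re-raising unknown panics matters: recovering everything would hide genuine bugs behind a mode switch.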
Anchor Reminder
At each turn, the agent injects a short reminder into the history to keep the LLM focused on the original task. This prevents drift in long conversations.
Tool Call Parsing
The parser in cli/agent/toolcall_parser.go uses a stateful scanner (not regex) for robustness:
Supported Formats
Scanner Algorithm
- Search for `<tool_call` case-insensitively
- Verify the next char is whitespace or `>` (not part of another tag like `<tool_caller>`)
- scanTagEnd(): advance forward respecting quotes (single and double)
  - Inside quotes, `>` is treated as literal text
  - Supports HTML entities (`&gt;`, `&quot;`, etc.)
- Extract name and args attributes regardless of order
- If self-closing fails, try paired tags with `</tool_call>`
- Fallback to JSON: try parseJSONToolCalls() in parallel
Why Not Regex?
Tool arguments frequently contain `>`, `"`, and nested JSON. A simple regex cannot reliably distinguish `>` inside a quoted attribute from `>` that closes the tag. The stateful scanner solves this by tracking quote state.
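The quote-tracking idea can be shown in a few lines. This is a simplified sketch, not the actual scanTagEnd(): isToolCallOpen and findTagEnd are hypothetical names, and entity decoding and attribute extraction are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// isToolCallOpen reports whether s[i:] starts a <tool_call tag, matched
// case-insensitively and followed by whitespace or '>', so that longer tag
// names such as <tool_caller> are rejected.
func isToolCallOpen(s string, i int) bool {
	const tag = "<tool_call"
	if len(s)-i <= len(tag) {
		return false // no room for the boundary character
	}
	if !strings.EqualFold(s[i:i+len(tag)], tag) {
		return false
	}
	switch s[i+len(tag)] {
	case '>', ' ', '\t', '\n', '\r':
		return true
	}
	return false
}

// findTagEnd returns the index of the '>' that closes the tag beginning at
// start, skipping any '>' that appears inside single or double quotes.
// It returns -1 if the tag is never closed.
func findTagEnd(s string, start int) int {
	var quote byte // 0 while outside quotes, else the opening quote char
	for i := start; i < len(s); i++ {
		c := s[i]
		switch {
		case quote != 0:
			if c == quote {
				quote = 0 // matching quote closes the attribute value
			}
		case c == '"' || c == '\'':
			quote = c
		case c == '>':
			return i
		}
	}
	return -1
}

func main() {
	s := `<tool_call name="shell" args='{"cmd":"echo a > b"}' />`
	fmt.Println("is tool call:", isToolCallOpen(s, 0))
	fmt.Println("ends at last char:", findTagEnd(s, 0) == len(s)-1)
}
```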
Argument Sanitization Pipeline
After parsing, each tool call goes through a 7-step pipeline in agent_tool_sanitizer.go:
History Compaction (3 Levels)
The HistoryCompactor in cli/history_compactor.go manages history size through a progressive pipeline:
Agent Mode Trigger
In agent mode, compaction is checked at each turn of the ReAct loop. The trigger fires when token usage exceeds 60% of the model’s budget.
Checkpoint/Rewind System
rewind.go implements a history snapshot system:
Structure
Behavior
- When saved: Before each LLM call (saveCheckpoint())
- Limit: Maximum of 20 checkpoints (FIFO; the oldest are discarded)
- Deep copy: Each checkpoint contains a complete, independent copy of the history
- Trigger: Press Esc+Esc (two Esc presses within 500ms) to open the rewind menu
- Restore: The user selects a checkpoint, and the history is replaced with the saved copy
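The behavior listed above can be sketched as follows. The message, checkpoint, and session types are simplified stand-ins invented for this example (the real conversationCheckpoint and models.Message are not shown in this document), and whether exactly 500ms still counts as "within" the window is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

const (
	maxCheckpoints  = 20                     // limit stated in the text
	doubleEscWindow = 500 * time.Millisecond // Esc+Esc detection window
)

// message is a simplified stand-in for models.Message.
type message struct{ Role, Content string }

// checkpoint is a hypothetical snapshot record.
type checkpoint struct {
	label   string
	history []message
}

type session struct {
	history     []message
	checkpoints []checkpoint
}

// saveCheckpoint stores an independent copy of the history and evicts the
// oldest snapshots once the limit is exceeded.
func (s *session) saveCheckpoint(label string) {
	snap := make([]message, len(s.history))
	copy(snap, s.history) // sufficient here because message has only value fields
	s.checkpoints = append(s.checkpoints, checkpoint{label: label, history: snap})
	if extra := len(s.checkpoints) - maxCheckpoints; extra > 0 {
		s.checkpoints = s.checkpoints[extra:]
	}
}

// restore replaces the live history with a copy of checkpoint i.
func (s *session) restore(i int) bool {
	if i < 0 || i >= len(s.checkpoints) {
		return false
	}
	s.history = append([]message(nil), s.checkpoints[i].history...)
	return true
}

// isDoubleEsc reports whether two Esc presses separated by gap should open
// the rewind menu.
func isDoubleEsc(gap time.Duration) bool {
	return gap <= doubleEscWindow
}

func main() {
	s := &session{}
	for i := 1; i <= 3; i++ {
		s.saveCheckpoint(fmt.Sprintf("before turn %d", i))
		s.history = append(s.history, message{Role: "user", Content: fmt.Sprintf("turn %d", i)})
	}
	s.restore(1) // rewind to the state just before turn 2
	fmt.Println("messages after rewind:", len(s.history))
	fmt.Println("double Esc at 300ms:", isDoubleEsc(300*time.Millisecond))
}
```

If the real message type holds slices or pointers, a plain copy would share them between snapshots, so a deeper copy would be needed.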
Provider Registry
Auto-registration Pattern
Each LLM provider auto-registers via init() in the llm/registry package:
ProviderInfo
Discovery Flow
No switch/case blocks are needed. To add a new provider, just create a package with register.go and implement the LLMClient interface.
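A compact sketch of the auto-registration pattern, collapsed into one file for readability. The method set of LLMClient, the fields of ProviderInfo, the Register, Lookup, and List functions, and the ECHO provider are all assumptions for illustration; in the layout described above the init() call would live in each provider's own register.go.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// LLMClient is shown with a single hypothetical method.
type LLMClient interface {
	Send(prompt string) (string, error)
}

// ProviderInfo fields are assumed for this sketch.
type ProviderInfo struct {
	Name string
	New  func(model string) LLMClient
}

var (
	mu        sync.RWMutex
	providers = map[string]ProviderInfo{}
)

// Register adds a provider to the global registry; it is meant to be
// called from init().
func Register(info ProviderInfo) {
	mu.Lock()
	defer mu.Unlock()
	providers[info.Name] = info
}

// Lookup finds a provider by name without any switch/case on provider names.
func Lookup(name string) (ProviderInfo, bool) {
	mu.RLock()
	defer mu.RUnlock()
	info, ok := providers[name]
	return info, ok
}

// List returns the registered provider names in sorted order.
func List() []string {
	mu.RLock()
	defer mu.RUnlock()
	names := make([]string, 0, len(providers))
	for name := range providers {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// echoClient is a toy provider used only to demonstrate registration.
type echoClient struct{ model string }

func (c echoClient) Send(prompt string) (string, error) {
	return c.model + ": " + prompt, nil
}

// init runs before main, so the provider is available without explicit wiring.
func init() {
	Register(ProviderInfo{
		Name: "ECHO",
		New:  func(model string) LLMClient { return echoClient{model: model} },
	})
}

func main() {
	fmt.Println(List())
	if info, ok := Lookup("ECHO"); ok {
		reply, _ := info.New("demo-model").Send("hello")
		fmt.Println(reply)
	}
}
```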
Memory Worker (Background Process)
The memoryWorker in cli/memory_worker.go extracts memories from the conversation in the background:
Parameters
| Constant | Value | Description |
|---|---|---|
| memoryMinNewMessages | 4 | Minimum new messages to trigger |
| memoryCooldown | 2 min | Minimum time between extractions |
| memoryExtractTimeout | 60s | Timeout for the LLM extraction call |
| compactionCheckInterval | 6h | How often to check for compaction |
| dailyCleanupInterval | 24h | How often to clean up daily notes |
Triggers
- nudge(): Called after each LLM response. If there are >= 4 new messages and cooldown has expired, runs extraction
- 3-minute ticker: The background loop checks periodically (for long sessions where the user types infrequently)
- Compaction ticker (6h): Consolidates old facts with low scores
- Cleanup ticker (24h): Removes expired daily notes
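The nudge() condition reduces to a small predicate. The sketch below reuses the constant names and values from the parameters table; shouldExtract is a hypothetical helper, and treating the cooldown as expired once exactly two minutes have elapsed is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// Values taken from the parameters table.
const (
	memoryMinNewMessages = 4
	memoryCooldown       = 2 * time.Minute
)

// shouldExtract is a hypothetical pure form of the nudge() decision:
// enough new messages have accumulated and the cooldown has expired.
func shouldExtract(newMessages int, sinceLastExtraction time.Duration) bool {
	return newMessages >= memoryMinNewMessages && sinceLastExtraction >= memoryCooldown
}

func main() {
	fmt.Println("3 msgs, 10m idle:", shouldExtract(3, 10*time.Minute))
	fmt.Println("6 msgs, 30s idle:", shouldExtract(6, 30*time.Second))
	fmt.Println("4 msgs, 2m idle: ", shouldExtract(4, 2*time.Minute))
}
```

Keeping the decision separate from the goroutine that performs the extraction makes the thresholds easy to test without timers.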
Extraction Pipeline
Main Components
CLI and Modes
The ChatCLI struct in cli/cli.go is the central point (~923 lines after decomposition). The Start() method initiates interactive mode using Bubble Tea (Charmbracelet). Helper methods, history management, mode switching, output formatting, prompt building, session management, and tool handling have been extracted into dedicated files (cli_helpers.go, cli_history.go, cli_mode.go, cli_output.go, cli_prompt.go, cli_session.go, cli_tools.go).
| Mode | File | Description |
|---|---|---|
| Interactive | cli/cli.go | Interactive prompt with auto-completion |
| Agent | cli/agent_mode.go | Task planning and execution |
| Coder | cli/cli.go + cli/agent_mode.go | Engineering loop with tool calls |
| One-shot | cli/cli.go (flag -p) | Single execution without TUI |
Entering agent mode uses a panic/recover mechanism to exit the go-prompt loop.
ChatCLI uses a unified history (cli.history) shared across all modes. When switching modes, the full context is preserved. The /compact and /rewind commands operate directly on this single history.
Message Bus
The cli/bus package implements a typed message bus with:
- Pub/sub with channel and type filters
- Request-reply with correlation IDs
- Atomic throughput metrics
Multi-Agent
The orchestration system in cli/agent/workers manages 12 specialist agents running in parallel goroutines with a configurable semaphore. Each worker has:
- Isolated mini ReAct loop (observe -> reason -> act)
- Own skills (accelerator and descriptive scripts)
- File locks (mutex per-filepath)
- Configurable timeout and max turns
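The concurrency controls above (a semaphore bounding parallel workers, plus one mutex per file path) can be sketched like this. The fileLocks and task types and the runAll function are hypothetical; the ReAct loop, timeouts, and turn limits are omitted, and each "edit" is reduced to incrementing a counter.

```go
package main

import (
	"fmt"
	"sync"
)

// fileLocks hands out one mutex per file path.
type fileLocks struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func (f *fileLocks) forPath(path string) *sync.Mutex {
	f.mu.Lock()
	defer f.mu.Unlock()
	if f.locks == nil {
		f.locks = make(map[string]*sync.Mutex)
	}
	if f.locks[path] == nil {
		f.locks[path] = &sync.Mutex{}
	}
	return f.locks[path]
}

// task is a hypothetical unit of work: one agent editing one file.
type task struct{ agent, path string }

// runAll runs every task in its own goroutine, allowing at most maxParallel
// to execute at once, and returns the number of edits applied per path.
func runAll(tasks []task, maxParallel int) map[string]int {
	if maxParallel < 1 {
		maxParallel = 1 // a zero-capacity semaphore would block forever
	}
	// Pre-create the counters so goroutines only read the map.
	edits := make(map[string]*int)
	for _, t := range tasks {
		if edits[t.path] == nil {
			edits[t.path] = new(int)
		}
	}

	sem := make(chan struct{}, maxParallel) // counting semaphore
	locks := &fileLocks{}
	var wg sync.WaitGroup
	for _, t := range tasks {
		wg.Add(1)
		go func(t task) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a worker slot
			defer func() { <-sem }() // release it when done

			lock := locks.forPath(t.path)
			lock.Lock()
			*edits[t.path]++ // simulated edit, serialized per file
			lock.Unlock()
		}(t)
	}
	wg.Wait()

	out := make(map[string]int, len(edits))
	for path, n := range edits {
		out[path] = *n
	}
	return out
}

func main() {
	paths := []string{"a.go", "b.go", "c.go"}
	var tasks []task
	for i := 0; i < 12; i++ {
		tasks = append(tasks, task{agent: fmt.Sprintf("agent-%d", i), path: paths[i%3]})
	}
	fmt.Println(runAll(tasks, 4))
}
```

Per-path locks let agents working on different files proceed in parallel while edits to the same file are serialized.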
Technologies
| Category | Library |
|---|---|
| TUI | Bubble Tea (Charmbracelet) |
| Markdown | Glamour (Charmbracelet) |
| Colors | Lipgloss (Charmbracelet) |
| Logger | Zap (Uber) |
| Log rotation | Lumberjack |
| Env files | Godotenv |
| i18n | golang.org/x/text |
| gRPC | google.golang.org/grpc |
| Kubernetes | k8s.io/client-go |
| Operator | controller-runtime (Kubebuilder) |
| Protobuf | google.golang.org/protobuf |
Design Patterns
- Auto-registration via init(): Providers register themselves automatically
- Interface-driven: LLMClient and ToolAwareClient for polymorphism
- Fallback chain: Intelligent error classification + exponential cooldown
- Stateful parser: XML parsing with escaped attributes (more robust than regex)
- embed.FS: i18n files embedded in the binary (always uses /, never filepath.Join)
- Panic/recover: Mode switching in go-prompt without restarting the process
- Unified history: A single message array shared across chat, agent, and coder modes
- Checkpoint/rewind: Deep copy of history before each LLM call, with selective restore
- 3-level compaction: Near-lossless trimming, structured summarization, emergency truncation
- Type-ahead queue: Messages typed during processing are queued and drained automatically
- Workspace context injection: Bootstrap + memory automatically injected into every system prompt
- Unified system prompt with caching: Attached contexts (/context attach) flow through the system prompt via SystemParts with CacheControl hints, enabling provider-level prompt caching (Anthropic cache_control: ephemeral, OpenAI automatic caching, Google context caching)
- Background memory extraction: Worker goroutine extracts and consolidates memories from the conversation without blocking the main flow