Tool Result Management

ChatCLI manages tool results in three ways: it keeps tool_use/tool_result pairs intact, caps how much of the context results may consume, and progressively compacts old results. This is essential for long agent sessions, where dozens of tool calls can saturate the context window.

Tool Result Pairing

Every tool_use (a tool call made by the model) must have a corresponding tool_result in the conversation history. When this pairing breaks — due to interruption, timeout, or silent error — the API rejects the history. The EnsureToolResultPairing system automatically validates and repairs:

Problem	Repair Action
`tool_use` without `tool_result` (orphan)	Injects synthetic error result
`tool_result` without `tool_use` (orphan)	Removes from history
Duplicate `tool_use` IDs	Keeps only the first occurrence

Synthetic Results

When a tool_use has no corresponding result, ChatCLI injects:

[Tool result missing -- the tool execution was interrupted or failed silently.
Do NOT retry this tool call. Analyze what went wrong and try a different approach.]

The message instructs the model to not repeat the failed tool call, preventing infinite retry loops.

3-Phase Validation

ID Collection

Traverses the entire history collecting tool_use IDs (from assistant messages) and tool_result IDs (from tool messages).

Misalignment Detection

Compares the two sets of IDs. Tool uses without a result are “missing”. Tool results without a use are “orphans”. Duplicate IDs are flagged for deduplication.

History Reconstruction

Rebuilds the history: removes orphans, deduplicates tool_use IDs, and injects synthetic results after assistant messages with unmatched tool_uses.
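The three phases above can be sketched in Go as follows. This is a minimal illustration, not ChatCLI's actual EnsureToolResultPairing: the `Message` type, its fields, and the `ensurePairing` function are hypothetical, and the real history types may differ.

```go
package main

import "fmt"

// Message is a hypothetical, simplified history entry (assumption).
type Message struct {
	Role       string   // "user", "assistant", or "tool"
	Content    string   // text payload
	ToolUseIDs []string // tool_use IDs issued by an assistant message
	ToolUseID  string   // tool_use ID answered by a tool message
}

const syntheticResult = "[Tool result missing -- the tool execution was interrupted or failed silently.\n" +
	"Do NOT retry this tool call. Analyze what went wrong and try a different approach.]"

func ensurePairing(history []Message) []Message {
	// Phase 1: collect tool_use IDs and tool_result IDs.
	used, answered := map[string]bool{}, map[string]bool{}
	for _, m := range history {
		switch m.Role {
		case "assistant":
			for _, id := range m.ToolUseIDs {
				used[id] = true
			}
		case "tool":
			answered[m.ToolUseID] = true
		}
	}
	// Phases 2 and 3: detect misalignments while rebuilding the history.
	seen := map[string]bool{}
	var out []Message
	for _, m := range history {
		switch m.Role {
		case "tool":
			if used[m.ToolUseID] { // orphan results are dropped
				out = append(out, m)
			}
		case "assistant":
			var kept, missing []string
			for _, id := range m.ToolUseIDs {
				if seen[id] {
					continue // duplicate ID: keep only the first occurrence
				}
				seen[id] = true
				kept = append(kept, id)
				if !answered[id] {
					missing = append(missing, id)
				}
			}
			m.ToolUseIDs = kept
			out = append(out, m)
			for _, id := range missing { // inject synthetic error results
				out = append(out, Message{Role: "tool", ToolUseID: id, Content: syntheticResult})
			}
		default:
			out = append(out, m)
		}
	}
	return out
}

func main() {
	history := []Message{
		{Role: "user", Content: "list the files"},
		{Role: "assistant", ToolUseIDs: []string{"a", "b"}},
		{Role: "tool", ToolUseID: "a", Content: "ok"},
		{Role: "tool", ToolUseID: "zzz", Content: "orphan"},
	}
	for _, m := range ensurePairing(history) {
		fmt.Printf("%-9s uses=%v answers=%q\n", m.Role, m.ToolUseIDs, m.ToolUseID)
	}
}
```

In this sketch the synthetic result for `b` is placed directly after the assistant message, and the orphan result `zzz` disappears from the rebuilt history.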

Result Budget Enforcement

Tool results such as large file reads or command output can quickly consume the context window. The budget system limits the aggregate size at three levels:

Level	Limit	Variable / Override	Default
Per-tool (new)	Cap applied inside the dispatch before aggregation	`TruncationAware.MaxResultChars()` capability per plugin	30,000 chars (global)
Per result	Maximum size of a single result	`CHATCLI_TOOL_RESULT_MAX_CHARS`	20,000 chars
Per turn	Aggregate size of all results in the turn	`CHATCLI_TOOL_RESULT_BUDGET_CHARS`	200,000 chars

Per-tool truncation (capability)

Plugins that implement plugins.TruncationAware declare their own cap — useful when the tool has non-default context needs:

Plugin	Cap	Rationale
`@read`	80,000	Large files (~1500 lines) are the primary code-learning surface; a low cap blinds the model
`@search`	60,000	Breadth-oriented structured output (file:line:match) — the model needs reach
`@tree`	50,000	Monorepo listings easily exceed 30k
Other plugins	30,000	Global default

Truncation preserves the historical head/tail shape (5000-char preview + 1000-char suffix + [TRUNCATED N chars omitted, M kept] marker).
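As an illustration, cap resolution and head/tail truncation might look like the sketch below. The text names `plugins.TruncationAware` and `MaxResultChars()`; everything else here is an assumption: the `int` return type, the `Plugin` interface, the `capFor` and `truncate` helpers, the `@example` plugin, and the newlines around the marker.

```go
package main

import (
	"fmt"
	"strings"
)

// TruncationAware mirrors the capability named in the text.
// The int return type is an assumption.
type TruncationAware interface {
	MaxResultChars() int
}

// Plugin is a hypothetical minimal plugin interface.
type Plugin interface{ Name() string }

const defaultMaxResultChars = 30000 // global default from the table

type readPlugin struct{}

func (readPlugin) Name() string        { return "@read" }
func (readPlugin) MaxResultChars() int { return 80000 }

type plainPlugin struct{} // declares no cap, so it gets the default

func (plainPlugin) Name() string { return "@example" }

// capFor returns the plugin's own cap when it declares one.
func capFor(p Plugin) int {
	if ta, ok := p.(TruncationAware); ok && ta.MaxResultChars() > 0 {
		return ta.MaxResultChars()
	}
	return defaultMaxResultChars
}

// truncate keeps a 5000-char preview and a 1000-char suffix around a marker.
func truncate(s string, max int) string {
	const head, tail = 5000, 1000
	r := []rune(s)
	if len(r) <= max || len(r) <= head+tail {
		return s
	}
	omitted := len(r) - head - tail
	marker := fmt.Sprintf("\n[TRUNCATED %d chars omitted, %d kept]\n", omitted, head+tail)
	return string(r[:head]) + marker + string(r[len(r)-tail:])
}

func main() {
	fmt.Println(capFor(readPlugin{}), capFor(plainPlugin{}))
	out := truncate(strings.Repeat("x", 40000), capFor(plainPlugin{}))
	fmt.Println(len(out))
}
```

Using a type assertion keeps the capability optional: plugins that do not care about truncation need no changes and fall back to the global default.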

How Enforcement Works

The budget is applied in two passes:

Pass 1: Per Result

Each individual result is checked against DefaultPerResultMaxChars (20,000 chars). If it exceeds the limit, the full content is saved to disk and replaced with a preview:

[first 4,000 chars of the result]

... [85,432 chars omitted -- full output saved to /tmp/chatcli-tool-results/budget_tc_1_0.txt]

[last 1,000 chars of the result]

Pass 2: Per Turn

If the total of all results in the turn still exceeds DefaultTurnBudgetChars (200,000 chars), the largest results are progressively truncated (from largest to smallest) until the turn fits within the budget.
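A rough Go sketch of the two passes follows. It is not the real EnforceToolResultBudget. The `enforceBudget`, `shrink`, and `pad` helpers and the `minKeep` floor are hypothetical, `shrink` stands in for the head/tail preview, and sizes are measured in bytes for brevity. How far each result is cut in the second pass is also an assumption.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// shrink is a stand-in for the head/tail preview: it simply cuts to limit bytes.
func shrink(s string, limit int) string {
	if len(s) <= limit {
		return s
	}
	return s[:limit]
}

// pad builds a demo result of n bytes.
func pad(n int) string { return strings.Repeat("x", n) }

// enforceBudget applies the per-result cap, then the per-turn budget.
// minKeep is a hypothetical floor below which a result is not cut further.
func enforceBudget(results []string, perResult, budget, minKeep int) []string {
	out := make([]string, len(results))
	total := 0
	// Pass 1: cap every result individually.
	for i, r := range results {
		out[i] = shrink(r, perResult)
		total += len(out[i])
	}
	// Pass 2: visit results from largest to smallest until the turn fits.
	order := make([]int, len(out))
	for i := range order {
		order[i] = i
	}
	sort.SliceStable(order, func(a, b int) bool {
		return len(out[order[a]]) > len(out[order[b]])
	})
	for _, i := range order {
		if total <= budget {
			break
		}
		target := len(out[i]) - (total - budget)
		if target < minKeep {
			target = minKeep
		}
		cut := shrink(out[i], target)
		total -= len(out[i]) - len(cut)
		out[i] = cut
	}
	return out
}

func main() {
	// Twelve 25,000-byte results: pass 1 caps each at 20,000 (240,000 total),
	// pass 2 trims the turn down to the 200,000 budget.
	var results []string
	for i := 0; i < 12; i++ {
		results = append(results, pad(25000))
	}
	total := 0
	for _, r := range enforceBudget(results, 20000, 200000, 1000) {
		total += len(r)
	}
	fmt.Println(total)
}
```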

Disk Persistence

Truncated results are saved as temporary files inside the Session Workspace, instead of the legacy global /tmp/chatcli-tool-results/:

$TMPDIR/chatcli-agent-<random>/tool-results/
  budget_tc_1_0_1.txt    # Full result from tool call 1
  budget_tc_2_3_2.txt    # Full result from tool call 2
  result_read_3.txt      # Secondary storage (workers package)

Moving to a per-session directory has two important effects:

Isolation between sessions. Multiple chatcli instances running in parallel on the same host no longer share the overflow pool.
On-demand reads by the agent. The scratch dir is on the agent’s read allowlist, so when the model encounters the [full output saved to ...] marker in the preview, it can open the file with read_file:

<tool_call name="@coder" args='{"cmd":"read","args":{"file":"/tmp/chatcli-agent-Xy7K3a/tool-results/budget_tc_3_1.txt","start":1200,"end":1500}}' />

Before this release, the path was a dead end — the read tool blocked it because it was outside the workspace boundary. The budget “saved the output” but the agent had no way to access it.

Files are automatically cleaned up when the session ends (ChatCLI.cleanup), respecting CHATCLI_AGENT_KEEP_TMPDIR=true for debugging. A periodic global cleanup is no longer required.
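The lifecycle described above can be sketched with the Go standard library. The `newScratch`, `cleanup`, `keepFromEnv`, and `exists` helpers are hypothetical names, not ChatCLI's; only the directory layout and the CHATCLI_AGENT_KEEP_TMPDIR variable come from the text.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// newScratch creates $TMPDIR/chatcli-agent-<random>/tool-results.
func newScratch() (root, results string, err error) {
	root, err = os.MkdirTemp("", "chatcli-agent-")
	if err != nil {
		return "", "", err
	}
	results = filepath.Join(root, "tool-results")
	if err = os.MkdirAll(results, 0o700); err != nil {
		os.RemoveAll(root)
		return "", "", err
	}
	return root, results, nil
}

// keepFromEnv reports whether the scratch dir should survive the session.
func keepFromEnv() bool {
	return os.Getenv("CHATCLI_AGENT_KEEP_TMPDIR") == "true"
}

// cleanup removes the whole session directory unless keep is set.
func cleanup(root string, keep bool) error {
	if keep {
		return nil
	}
	return os.RemoveAll(root)
}

// exists reports whether path is present on disk.
func exists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

func main() {
	root, results, err := newScratch()
	if err != nil {
		fmt.Println("scratch setup failed:", err)
		return
	}
	defer cleanup(root, keepFromEnv())
	fmt.Println("overflow files go to", results)
}
```

Because each session owns a uniquely named directory, removing it at exit cannot touch files that belong to another running instance.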

Preview: Head + Tail

The preview retains the beginning and end of the result to maximize usefulness:

Component	Size
Head (beginning)	4,000 chars (cuts at the last line break)
Reference	File path on disk
Tail (end)	1,000 chars (cuts at the first line break)
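One possible way to build such a preview is sketched below. The `preview` function is hypothetical, sizes are in bytes, and the omitted count is printed without thousands separators. It snaps the head back to its last line break and the tail forward to its first, so neither side starts or ends mid-line.

```go
package main

import (
	"fmt"
	"strings"
)

// preview keeps up to headMax bytes from the start and tailMax bytes from the
// end, snapping both cuts to line breaks, and points to the full file on disk.
func preview(s, path string, headMax, tailMax int) string {
	if len(s) <= headMax+tailMax {
		return s
	}
	head := s[:headMax]
	if i := strings.LastIndexByte(head, '\n'); i > 0 {
		head = head[:i] // cut at the last line break
	}
	tail := s[len(s)-tailMax:]
	if i := strings.IndexByte(tail, '\n'); i >= 0 && i+1 < len(tail) {
		tail = tail[i+1:] // cut at the first line break
	}
	omitted := len(s) - len(head) - len(tail)
	return fmt.Sprintf("%s\n\n... [%d chars omitted -- full output saved to %s]\n\n%s",
		head, omitted, path, tail)
}

func main() {
	var b strings.Builder
	for i := 1; i <= 2000; i++ {
		fmt.Fprintf(&b, "line %d\n", i)
	}
	// The path is a placeholder for the demo, not a real ChatCLI location.
	fmt.Println(preview(b.String(), "/path/to/full-output.txt", 4000, 1000))
}
```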

Progressive Microcompaction

Microcompaction progressively reduces the size of old tool results as the conversation advances, without losing critical information:

Result Age	Action	Details
Current and previous turn	No change	Results preserved in full
2+ turns ago	Truncated	Head (2,000 chars) + tail (500 chars)
4+ turns ago	Summarized	One-line description: `[Old tool result cleared -- 450 lines, 28K chars, Go source]`

Content Type Detection

The summary automatically identifies the content type for context:

Content	Detected Type
Starts with `{` or `[`	JSON
Contains `package`	Go source
Contains `def`	Python source
Contains `function`	JavaScript source
Starts with `diff` or `---`	diff
Starts with `commit`	git log
Other	text
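The detection table can be read as a chain of prefix and substring checks. The sketch below assumes the rules are evaluated in the order listed and that the first match wins; the real order and the `detectType` name are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// detectType applies the table's rules top to bottom (assumed order).
func detectType(content string) string {
	switch {
	case strings.HasPrefix(content, "{"), strings.HasPrefix(content, "["):
		return "JSON"
	case strings.Contains(content, "package"):
		return "Go source"
	case strings.Contains(content, "def"):
		return "Python source"
	case strings.Contains(content, "function"):
		return "JavaScript source"
	case strings.HasPrefix(content, "diff"), strings.HasPrefix(content, "---"):
		return "diff"
	case strings.HasPrefix(content, "commit"):
		return "git log"
	default:
		return "text"
	}
}

func main() {
	samples := []string{
		`{"ok": true}`,
		"package main\n\nfunc main() {}",
		"diff --git a/x b/x",
		"plain notes",
	}
	for _, s := range samples {
		fmt.Printf("%-20q -> %s\n", s[:min(len(s), 16)], detectType(s))
	}
}

// min is defined locally so the sketch also builds on Go versions before 1.21.
func min(a, b int) int {
	if a < b {
		return a
	}
	return b
}
```

Substring rules of this kind are heuristics: under this ordering, a diff that happens to contain the word `package` is labeled Go source, which is acceptable for a one-line summary but not for anything stricter.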

Microcompaction Configuration

Variable	Description	Default
`CHATCLI_MICROCOMPACT_TRUNCATE_TURNS`	Turns before truncating	2
`CHATCLI_MICROCOMPACT_SUMMARIZE_TURNS`	Turns before summarizing	4

Only results larger than 3,000 chars are compacted; smaller results are always preserved. Results from write and execution tools are also preserved, because they contain critical error information.
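The age thresholds and exemptions can be combined into one decision function, sketched here under assumptions. The `ToolResult` type, its `Preserve` and `Kind` fields, the `"\n...\n"` joiner, and the rounding of the size to whole thousands are all hypothetical; sizes are in bytes.

```go
package main

import (
	"fmt"
	"strings"
)

// ToolResult is a hypothetical record of one tool result in the history.
type ToolResult struct {
	Turn     int    // turn in which the result was produced
	Content  string // raw result text
	Kind     string // detected content type, e.g. "Go source"
	Preserve bool   // true for write and execution tools
}

const minCompactChars = 3000 // results at or below this size are left alone

// filler builds a demo payload of n bytes.
func filler(n int) string { return strings.Repeat("x", n) }

// microcompact returns the content to keep for r at currentTurn.
func microcompact(r ToolResult, currentTurn, truncateTurns, summarizeTurns int) string {
	age := currentTurn - r.Turn
	if r.Preserve || len(r.Content) <= minCompactChars || age < truncateTurns {
		return r.Content
	}
	if age >= summarizeTurns {
		lines := strings.Count(r.Content, "\n") + 1
		return fmt.Sprintf("[Old tool result cleared -- %d lines, %dK chars, %s]",
			lines, len(r.Content)/1000, r.Kind)
	}
	// Between the two thresholds: keep head (2,000) and tail (500).
	return r.Content[:2000] + "\n...\n" + r.Content[len(r.Content)-500:]
}

func main() {
	r := ToolResult{Turn: 1, Content: filler(28000), Kind: "Go source"}
	for turn := 1; turn <= 5; turn++ {
		fmt.Printf("turn %d: %d chars kept\n", turn, len(microcompact(r, turn, 2, 4)))
	}
}
```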

Complete Flow

Tool result management is applied in this order during the agent loop:

Tool executes and returns result
EnsureToolResultPairing -> fixes misalignments
EnforceToolResultBudget -> truncates large results
ApplyMicrocompact -> compacts old results
Clean history sent to the API

Complete Configuration

Environment Variable	Description	Default
`CHATCLI_TOOL_RESULT_BUDGET_CHARS`	Aggregate budget per turn	200,000
`CHATCLI_TOOL_RESULT_MAX_CHARS`	Maximum size per result	20,000
`CHATCLI_MICROCOMPACT_TRUNCATE_TURNS`	Turns before truncation starts	2
`CHATCLI_MICROCOMPACT_SUMMARIZE_TURNS`	Turns before summarization starts	4
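A sketch of how these four variables could be read with their defaults is shown below. The `Config` struct, `intFrom`, and `loadConfig` are hypothetical, and falling back to the default on an empty, non-numeric, or non-positive value is an assumption about the intended behavior.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// Config holds the four tunables from the table (hypothetical struct).
type Config struct {
	TurnBudgetChars   int
	PerResultMaxChars int
	TruncateTurns     int
	SummarizeTurns    int
}

// intFrom reads name through lookup and falls back to def when the value
// is missing or not a positive integer.
func intFrom(lookup func(string) (string, bool), name string, def int) int {
	raw, ok := lookup(name)
	if !ok {
		return def
	}
	n, err := strconv.Atoi(strings.TrimSpace(raw))
	if err != nil || n <= 0 {
		return def
	}
	return n
}

// loadConfig takes the lookup function as a parameter so it can be tested
// without touching the process environment.
func loadConfig(lookup func(string) (string, bool)) Config {
	return Config{
		TurnBudgetChars:   intFrom(lookup, "CHATCLI_TOOL_RESULT_BUDGET_CHARS", 200000),
		PerResultMaxChars: intFrom(lookup, "CHATCLI_TOOL_RESULT_MAX_CHARS", 20000),
		TruncateTurns:     intFrom(lookup, "CHATCLI_MICROCOMPACT_TRUNCATE_TURNS", 2),
		SummarizeTurns:    intFrom(lookup, "CHATCLI_MICROCOMPACT_SUMMARIZE_TURNS", 4),
	}
}

func main() {
	fmt.Printf("%+v\n", loadConfig(os.LookupEnv))
}
```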

Next Steps

Session Workspace

Where overflow files live and how the agent reads them.

Subagent Delegation

Complementary strategy to avoid saturating context with raw data.

Context Recovery

What happens when the context overflows even with budgeting in place.

Cost Tracking

Monitor token consumption including tool results.
