## StreamingClient Interface
Streaming is implemented as an optional interface that providers can adopt. Providers that implement StreamingClient receive streaming automatically.
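A minimal sketch of how such an optional interface can work (the interface name matches the document; the method signature and helper names are illustrative assumptions, not ChatCLI's actual API):

```go
package main

import (
	"context"
	"fmt"
)

// StreamChunk is an incremental piece of a streamed response
// (the full field list is described in the StreamChunk table).
type StreamChunk struct {
	Text string
	Done bool
}

// StreamingClient is the optional interface a provider implements
// to opt in to streaming. The signature here is an assumption.
type StreamingClient interface {
	StreamPrompt(ctx context.Context, prompt string) (<-chan StreamChunk, error)
}

// supportsStreaming shows how a caller can detect the capability
// with a type assertion and fall back to SendPrompt otherwise.
func supportsStreaming(provider any) bool {
	_, ok := provider.(StreamingClient)
	return ok
}

// plainProvider implements only the non-streaming path.
type plainProvider struct{}

func main() {
	// prints "false": this provider would use the SendPrompt fallback
	fmt.Println(supportsStreaming(plainProvider{}))
}
```

The type assertion is what makes the interface optional: no provider is forced to change, and the caller branches on capability at runtime.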
Providers that do not implement StreamingClient continue to work normally: ChatCLI falls back to SendPrompt (complete response) automatically.

## StreamChunk
Each streaming chunk carries:

| Field | Type | Description |
|---|---|---|
| Text | string | Incremental text in this chunk (may be empty) |
| Done | bool | true on the final chunk |
| Usage | *UsageInfo | Token usage data (only on the final chunk) |
| StopReason | string | Stop reason: end_turn, max_tokens, tool_use |
| Error | error | Error during streaming (terminates the stream) |
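The table above maps naturally onto a Go struct. A sketch of what the type might look like (the UsageInfo field names are assumptions):

```go
package main

import (
	"errors"
	"fmt"
)

// UsageInfo holds token usage; these field names are illustrative.
type UsageInfo struct {
	InputTokens  int
	OutputTokens int
}

// StreamChunk mirrors the fields described in the table above.
type StreamChunk struct {
	Text       string     // incremental text in this chunk (may be empty)
	Done       bool       // true on the final chunk
	Usage      *UsageInfo // token usage, only on the final chunk
	StopReason string     // end_turn, max_tokens, or tool_use
	Error      error      // non-nil terminates the stream
}

func main() {
	// A well-formed final chunk: Done set, usage and stop reason attached.
	final := StreamChunk{
		Done:       true,
		StopReason: "end_turn",
		Usage:      &UsageInfo{InputTokens: 12, OutputTokens: 34},
	}
	fmt.Println(final.Done, final.StopReason)

	// An error chunk: Error set, which ends the stream.
	failed := StreamChunk{Error: errors.New("connection reset")}
	fmt.Println(failed.Error != nil)
}
```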
## Streaming Contract
- The channel delivers zero or more text chunks
- The final chunk has Done=true and may include Usage and StopReason
- If an error occurs, a chunk with Error is sent and the channel closes
- The channel closes after the final chunk or error
- The caller can cancel via context
## Supported Providers
| Provider | Streaming | Notes |
|---|---|---|
| Anthropic (API Key) | Yes | Native streaming via Messages API |
| Anthropic (OAuth) | Yes | Streaming via OAuth token |
| OpenAI | Yes | Streaming via Chat Completions |
| ZAI (Zhipu AI) | Yes | OpenAI-compatible streaming |
| MiniMax | Yes | OpenAI-compatible streaming |
| OpenRouter | Yes | Streaming via OpenAI-compatible API |
| Google (Gemini) | No | Fallback to complete response |
| xAI (Grok) | No | Fallback to complete response |
| GitHub Models | No | Fallback to complete response |
| Ollama | No | Fallback to complete response |
## Stream Watchdog
The Stream Watchdog monitors the stream to detect stalls (periods with no incoming data) and prevent ChatCLI from hanging indefinitely. Two timers are involved:
| Timer | Duration | Action |
|---|---|---|
| Warning | 45 seconds | Logs a stall warning |
| Idle Timeout | 90 seconds | Aborts stream and returns partial content |
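The idle-timeout half of the watchdog can be sketched with a timer that resets on every chunk (durations are shortened here so the example runs fast; ChatCLI's defaults are 45s and 90s per the table, and the warning timer is omitted for brevity):

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Shortened for the example; the documented default is 90 seconds.
const abortAfter = 50 * time.Millisecond

var errIdleTimeout = errors.New("stream idle timeout: returning partial content")

// watch reads text chunks and resets an idle timer on every arrival.
// If no data arrives within abortAfter, it aborts with partial content.
func watch(chunks <-chan string) (string, error) {
	var out string
	idle := time.NewTimer(abortAfter)
	defer idle.Stop()
	for {
		select {
		case s, ok := <-chunks:
			if !ok {
				return out, nil // stream ended normally
			}
			out += s
			// data arrived: reset the idle timer (draining it if it fired)
			if !idle.Stop() {
				<-idle.C
			}
			idle.Reset(abortAfter)
		case <-idle.C:
			// stall detected: abort and surface what we have
			return out, errIdleTimeout
		}
	}
}

func main() {
	ch := make(chan string)
	go func() {
		ch <- "partial "
		ch <- "content"
		// then stall: no more sends, channel never closed
	}()
	out, err := watch(ch)
	fmt.Println(out, err)
}
```

Returning the partial content alongside the error is what lets the caller show the user everything received before the stall.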
## Watchdog Configuration
| Environment Variable | Description | Default |
|---|---|---|
| CHATCLI_STREAM_IDLE_TIMEOUT_SECONDS | Idle timeout in seconds | 90 |
## Fallback to Non-Streaming
When streaming is not available (the provider does not support it, or a connection error occurs), ChatCLI falls back automatically. The DrainStream function converts a stream into a complete response when needed.
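A sketch of what DrainStream might look like, under the assumption that it accumulates all chunks and surfaces the final chunk's metadata (ChatCLI's actual signature may differ):

```go
package main

import (
	"fmt"
	"strings"
)

type UsageInfo struct{ OutputTokens int }

type StreamChunk struct {
	Text       string
	Done       bool
	Usage      *UsageInfo
	StopReason string
	Error      error
}

// DrainStream consumes every chunk and assembles a complete response,
// surfacing the final chunk's usage and stop reason.
func DrainStream(chunks <-chan StreamChunk) (text string, usage *UsageInfo, stop string, err error) {
	var b strings.Builder
	for c := range chunks {
		if c.Error != nil {
			// an error chunk terminates the stream; keep partial text
			return b.String(), usage, stop, c.Error
		}
		b.WriteString(c.Text)
		if c.Done {
			usage, stop = c.Usage, c.StopReason
		}
	}
	return b.String(), usage, stop, nil
}

func main() {
	ch := make(chan StreamChunk, 2)
	ch <- StreamChunk{Text: "42"}
	ch <- StreamChunk{Done: true, StopReason: "end_turn", Usage: &UsageInfo{OutputTokens: 1}}
	close(ch)

	text, usage, stop, err := DrainStream(ch)
	fmt.Println(text, usage.OutputTokens, stop, err)
}
```

Because it simply ranges until the channel closes, the same helper works whether the stream ends with Done, an error, or a bare close.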
## TUI Integration
In interactive mode (Bubble Tea), streaming integrates directly with the renderer:

- Each chunk is emitted as an event via TUIEmitter
- The Bubble Tea model updates the view incrementally
- Markdown is rendered progressively via Glamour
- The status bar shows the streaming state in real time
In one-shot mode (-p), streaming is disabled and DrainStream is used to collect the complete response before printing.

## Next Steps
- **Context Recovery**: what happens when max_tokens is reached during streaming.
- **Provider Fallback**: fallback chain between providers with and without streaming.
- **Native Tool Use**: streaming with native tool calls.
- **Progress UI**: visual indicators during agent streaming.