The `chatcli connect` command transforms your local terminal into a client that connects to a remote ChatCLI server. The entire interactive experience (sessions, contexts, agent, coder) works transparently, as if the LLM were running locally.
## Basic Connection
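A minimal connection sketch (the address and token values are placeholders):

```bash
chatcli connect --addr chatcli.example.com:50051 --token "$CHATCLI_REMOTE_TOKEN"
```

The same values can come from environment variables instead of flags, as listed in the table below.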
## All Flags
| Flag | Description | Env Var |
|---|---|---|
| `--addr <host:port>` | Server address | `CHATCLI_REMOTE_ADDR` |
| `--token <string>` | Authentication token | `CHATCLI_REMOTE_TOKEN` |
| `--provider <name>` | Overrides the server's LLM provider | |
| `--model <name>` | Overrides the server's LLM model | |
| `--llm-key <string>` | Your own API key (sent to the server) | `CHATCLI_CLIENT_API_KEY` |
| `--use-local-auth` | Uses OAuth credentials from the local auth store | |
| `--tls` | Enables TLS connection | |
| `--ca-cert <path>` | CA certificate for TLS verification | |
| `-p <prompt>` | One-shot mode: send prompt and exit | |
| `--raw` | Raw output (no Markdown/ANSI formatting) | |
| `--max-tokens <int>` | Maximum tokens in response | |
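Flags can be combined. For example, overriding the server's provider and model while capping the response size (the provider and model names here are illustrative, not defaults):

```bash
chatcli connect --addr chatcli.example.com:50051 \
  --provider openai --model gpt-4o \
  --max-tokens 2048
```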
### StackSpot Flags
| Flag | Description |
|---|---|
| `--client-id` | StackSpot Client ID |
| `--client-key` | StackSpot Client Key |
| `--realm` | StackSpot Realm/Tenant |
| `--agent-id` | StackSpot Agent ID |
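A hypothetical StackSpot-backed connection using these flags (every value below is a placeholder):

```bash
chatcli connect --addr chatcli.example.com:50051 \
  --client-id my-client-id --client-key my-client-key \
  --realm my-tenant --agent-id my-agent-id
```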
### Ollama Flags
| Flag | Description |
|---|---|
| `--ollama-url` | Ollama base URL (e.g., `http://gpu:11434`) |
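Pointing the server at a remote Ollama instance might look like this (the GPU host URL is the example from the table):

```bash
chatcli connect --addr chatcli.example.com:50051 \
  --ollama-url http://gpu:11434
```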
## Credential Modes
You can choose how to authenticate with the LLM provider:

- Server Credentials
- Your Own API Key
- Local OAuth
- StackSpot
- Ollama
Do not send any credential flags; the server uses its own API keys.
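In this mode the connection carries only the server auth token, never an LLM key (address and token are placeholders):

```bash
# No --llm-key, no provider credentials: the server's own API keys are used
chatcli connect --addr chatcli.example.com:50051 --token "$CHATCLI_REMOTE_TOKEN"
```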
## One-Shot Mode via Connect
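A sketch of a one-shot request, combining `-p` with `--raw` for plain output (address, token, and prompt are placeholders):

```bash
chatcli connect --addr chatcli.example.com:50051 \
  --token "$CHATCLI_REMOTE_TOKEN" \
  -p "Summarize the open TODOs in this repo" --raw
```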
Send a single prompt to the remote server and receive the response.

## Interactive Mode
Without the `-p` flag, ChatCLI starts the full interactive mode:
- Sessions: `/session save`, `/session load`, `/session list`
- Agent: `/agent <task>` or `/run <task>`
- Coder: `/coder <task>`
- Context: `@file`, `@git`, `@command`, `@env`, `@history`
- Persistence: `/context create`, `/context attach`
- Switch: `/switch` to change provider/model
- Watcher: `/watch status` to see K8s Watcher status
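A hypothetical interactive exchange using these commands (the prompt marker and output lines are illustrative, not actual ChatCLI output):

```text
> @git
Git context attached.
> /agent review the last commit for regressions
...
> /session save code-review
Session saved.
```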
## Remote Resource Discovery
Upon connecting, the client automatically discovers plugins, agents, and skills available on the server:

- Remote Plugins
- Remote Agents and Skills
Server plugins appear in `/plugin list` with the `[remote]` tag. They are executed on the server: the client sends the command via gRPC and receives the result.

## Hybrid Mode
- Local and remote plugins coexist; the `[remote]` prefix indicates the origin
- Local and remote agents are listed together; when loading, resolution is transparent
- When disconnecting (`/disconnect`), remote resources are automatically removed
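For instance, a hybrid session might look like this (plugin names and output format are illustrative):

```text
> /plugin list
  jira-sync    [remote]
  local-notes
> /disconnect
Disconnected; remote resources removed.
```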
## Check K8s Watcher Status
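Hypothetically, querying the watcher from a connected client might look like this (the status output is illustrative):

```text
> /watch status
Watcher: active (namespace: default)
```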
If the server has an active K8s Watcher, you can query the status remotely.

## Environment Variables
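For example, exporting the variables from the flag table (all values are placeholders):

```shell
export CHATCLI_REMOTE_ADDR=chatcli.example.com:50051
export CHATCLI_REMOTE_TOKEN=my-server-token
# Optional: send your own LLM key instead of using the server's
export CHATCLI_CLIENT_API_KEY=my-llm-key
```

After this, `chatcli connect` with no flags picks up the address and token.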
Configure default values via environment variables to avoid typing flags every time.

## TLS and Security
For a complete security guide (authentication, container hardening, RBAC, etc.), see the security documentation.
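A TLS-enabled connection with a custom CA might look like this (the address and certificate path are placeholders):

```bash
chatcli connect --addr chatcli.example.com:50051 \
  --tls --ca-cert /etc/chatcli/ca.pem
```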
## Load Balancing with Multiple Replicas
When the ChatCLI server runs with multiple replicas in Kubernetes, the client automatically distributes connections across available pods:

- The client uses client-side round-robin via the gRPC `dns:///` resolver
- Requires a headless Service (`clusterIP: None`) in Kubernetes
- Built-in keepalive (ping every 10s) detects inactive pods and reconnects quickly
- In the Helm chart, enable `service.headless: true` when `replicaCount > 1`
- In the Operator, headless is activated automatically when `spec.replicas > 1`
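A Helm invocation enabling this setup might look like the following sketch (the release name and chart path are placeholders; the two `--set` values come from the bullets above):

```bash
helm upgrade --install chatcli ./chatcli-chart \
  --set replicaCount=3 \
  --set service.headless=true
```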