The `chatcli connect` command transforms your local terminal into a client that connects to a remote ChatCLI server. The entire interactive experience (sessions, contexts, agent, coder) works transparently, as if the LLM were running locally.
## Basic Connection
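A minimal connection sketch (the address and token values are placeholders):

```bash
chatcli connect --addr chatcli.example.com:50051 --token "$CHATCLI_REMOTE_TOKEN"
```

The same values can come from environment variables instead of flags, as listed in the table below.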
## All Flags
| Flag | Description | Env Var |
|---|---|---|
| `--addr <host:port>` | Server address | `CHATCLI_REMOTE_ADDR` |
| `--token <string>` | Authentication token | `CHATCLI_REMOTE_TOKEN` |
| `--provider <name>` | Overrides the server's LLM provider | |
| `--model <name>` | Overrides the server's LLM model | |
| `--llm-key <string>` | Your own API key (sent to the server) | `CHATCLI_CLIENT_API_KEY` |
| `--use-local-auth` | Uses OAuth credentials from the local auth store | |
| `--tls` | Enables TLS connection | |
| `--ca-cert <path>` | CA certificate for TLS verification | |
| `-p <prompt>` | One-shot mode: send prompt and exit | |
| `--raw` | Raw output (no Markdown/ANSI formatting) | |
| `--max-tokens <int>` | Maximum tokens in response | |
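Flags can be combined. For example, overriding the server's provider and model while capping the response size (the provider and model names here are illustrative, not defaults):

```bash
chatcli connect --addr chatcli.example.com:50051 \
  --provider openai --model gpt-4o \
  --max-tokens 2048
```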
### StackSpot Flags
| Flag | Description |
|---|---|
| `--client-id` | StackSpot Client ID |
| `--client-key` | StackSpot Client Key |
| `--realm` | StackSpot Realm/Tenant |
| `--agent-id` | StackSpot Agent ID |
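A hypothetical StackSpot-backed connection using these flags (every value below is a placeholder):

```bash
chatcli connect --addr chatcli.example.com:50051 \
  --client-id my-client-id --client-key my-client-key \
  --realm my-tenant --agent-id my-agent-id
```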
### Ollama Flags
| Flag | Description |
|---|---|
| `--ollama-url` | Ollama base URL (e.g., `http://gpu:11434`) |
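Pointing the server at a remote Ollama instance might look like this (the GPU host URL is the example from the table):

```bash
chatcli connect --addr chatcli.example.com:50051 \
  --ollama-url http://gpu:11434
```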
## Credential Modes
You can choose how to authenticate with the LLM provider:

- Server Credentials
- Your Own API Key
- Local OAuth
- StackSpot
- Ollama
Do not send any credential flags; the server uses its own API keys.
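In this mode the connection carries only the server auth token, never an LLM key (address and token are placeholders):

```bash
# No --llm-key, no provider credentials: the server's own API keys are used
chatcli connect --addr chatcli.example.com:50051 --token "$CHATCLI_REMOTE_TOKEN"
```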
## One-Shot Mode via Connect
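A sketch of a one-shot request, combining `-p` with `--raw` for plain output (address, token, and prompt are placeholders):

```bash
chatcli connect --addr chatcli.example.com:50051 \
  --token "$CHATCLI_REMOTE_TOKEN" \
  -p "Summarize the open TODOs in this repo" --raw
```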
Send a single prompt to the remote server and receive the response.

## Interactive Mode
Without the `-p` flag, ChatCLI starts the full interactive mode:
- Sessions: `/session save`, `/session load`, `/session list`
- Agent: `/agent <task>` or `/run <task>`
- Coder: `/coder <task>`
- Context: `@file`, `@git`, `@command`, `@env`, `@history`
- Persistence: `/context create`, `/context attach`
- Switch: `/switch` to change provider/model
- Watcher: `/watch status` to see K8s Watcher status
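A hypothetical interactive exchange using these commands (the prompt marker and output lines are illustrative, not actual ChatCLI output):

```text
> @git
Git context attached.
> /agent review the last commit for regressions
...
> /session save code-review
Session saved.
```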
## Remote Resource Discovery
Upon connecting, the client automatically discovers plugins, agents, and skills available on the server:

- Remote Plugins
- Remote Agents and Skills
Server plugins appear in `/plugin list` with the `[remote]` tag. They are executed on the server: the client sends the command via gRPC and receives the result.

## Hybrid Mode
- Local and remote plugins coexist; the `[remote]` prefix indicates the origin
- Local and remote agents are listed together; when loading, resolution is transparent
- When disconnecting (`/disconnect`), remote resources are automatically removed
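For instance, a hybrid session might look like this (plugin names and output format are illustrative):

```text
> /plugin list
  jira-sync    [remote]
  local-notes
> /disconnect
Disconnected; remote resources removed.
```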
## Check K8s Watcher Status
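Hypothetically, querying the watcher from a connected client might look like this (the status output is illustrative):

```text
> /watch status
Watcher: active (namespace: default)
```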
If the server has an active K8s Watcher, you can query the status remotely.

## Environment Variables
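For example, exporting the variables from the flag table (all values are placeholders):

```shell
export CHATCLI_REMOTE_ADDR=chatcli.example.com:50051
export CHATCLI_REMOTE_TOKEN=my-server-token
# Optional: send your own LLM key instead of using the server's
export CHATCLI_CLIENT_API_KEY=my-llm-key
```

After this, `chatcli connect` with no flags picks up the address and token.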
Configure default values via environment variables to avoid typing flags every time.

## TLS and Security
For a complete security guide (authentication, container hardening, RBAC, etc.), see the security documentation.
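A TLS-enabled connection with a custom CA might look like this (the address and certificate path are placeholders):

```bash
chatcli connect --addr chatcli.example.com:50051 \
  --tls --ca-cert /etc/chatcli/ca.pem
```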
## Load Balancing with Multiple Replicas
When the ChatCLI server runs with multiple replicas in Kubernetes, the client automatically distributes connections across available pods:

- The client uses client-side round-robin via the gRPC `dns:///` resolver
- Requires a headless Service (`clusterIP: None`) in Kubernetes
- Built-in keepalive (ping every 10s) detects inactive pods and reconnects quickly
- In the Helm chart, enable `service.headless: true` when `replicaCount > 1`
- In the Operator, headless is activated automatically when `spec.replicas > 1`
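A Helm invocation enabling this setup might look like the following sketch (the release name and chart path are placeholders; the two `--set` values come from the bullets above):

```bash
helm upgrade --install chatcli ./chatcli-chart \
  --set replicaCount=3 \
  --set service.headless=true
```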