ChatCLI can be packaged as a Docker container and deployed on Kubernetes using the official Helm chart. This page covers all deployment scenarios.

Official Images (GHCR)

Official Docker images are automatically published to the GitHub Container Registry with each release:

ChatCLI Server

ghcr.io/diillson/chatcli:latest

Kubernetes Operator

ghcr.io/diillson/chatcli-operator:latest
# Pull the server image
docker pull ghcr.io/diillson/chatcli:latest

# Or a specific version
docker pull ghcr.io/diillson/chatcli:v1.57.0

# Pull the operator image
docker pull ghcr.io/diillson/chatcli-operator:latest
The images support multi-arch (linux/amd64 and linux/arm64).

Docker

Building the Image (Local)

# From the project root
docker build -t chatcli .
The Dockerfile uses a multi-stage build to produce a minimal image (~20MB):
  • Build stage: golang:1.25-alpine compiles the binary
  • Runtime stage: alpine:3.21 with non-root user and built-in health check
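As a hedged sketch of that layout (the stage names, build flags, and health-check command below are assumptions based on the description above, not the project's actual Dockerfile):

```dockerfile
# Build stage: compile a static binary with the Go toolchain
FROM golang:1.25-alpine AS build
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /out/chatcli .

# Runtime stage: minimal Alpine image, non-root user, health check
FROM alpine:3.21
RUN adduser -D -u 1000 chatcli
COPY --from=build /out/chatcli /usr/local/bin/chatcli
USER chatcli
HEALTHCHECK --interval=30s CMD ["chatcli", "--version"]
ENTRYPOINT ["chatcli"]
```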

Building the Operator Image (Local)

# IMPORTANT: must be built from the repository root
# (the operator's go.mod uses a replace directive pointing to ../)
docker build -f operator/Dockerfile -t ghcr.io/diillson/chatcli-operator:latest .
The operator Dockerfile uses:
  • Build stage: golang:1.25 with multi-arch support (TARGETARCH)
  • Runtime stage: gcr.io/distroless/static:nonroot (maximum security, no shell)

Running with Docker

docker run -p 50051:50051 \
  -e LLM_PROVIDER=OPENAI \
  -e OPENAI_API_KEY=sk-xxx \
  chatcli

Docker Compose

The project includes a docker-compose.yml ready for development:
1. Set the variables:

   export LLM_PROVIDER=OPENAI
   export OPENAI_API_KEY=sk-xxx

2. Start the container:

   docker compose up -d

3. Connect from your terminal:

   chatcli connect localhost:50051
Docker Compose configures:
  • Port 50051 exposed
  • Persistent volumes for sessions and plugins
  • Automatic restart (unless-stopped)
  • All LLM variables via environment
  • Security hardening: read-only filesystem, no-new-privileges, CPU/memory limits, tmpfs for /tmp

docker-compose.yml File

version: "3.9"

services:
  chatcli-server:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: chatcli-server
    ports:
      - "50051:50051"
    environment:
      CHATCLI_SERVER_PORT: "50051"
      CHATCLI_SERVER_TOKEN: "${CHATCLI_SERVER_TOKEN:-}"
      LLM_PROVIDER: "${LLM_PROVIDER:-}"
      OPENAI_API_KEY: "${OPENAI_API_KEY:-}"
      ANTHROPIC_API_KEY: "${ANTHROPIC_API_KEY:-}"
      GOOGLEAI_API_KEY: "${GOOGLEAI_API_KEY:-}"
      OLLAMA_ENABLED: "${OLLAMA_ENABLED:-}"
      OLLAMA_BASE_URL: "${OLLAMA_BASE_URL:-}"
      GITHUB_COPILOT_TOKEN: "${GITHUB_COPILOT_TOKEN:-}"
      COPILOT_MODEL: "${COPILOT_MODEL:-}"
      LOG_LEVEL: "${LOG_LEVEL:-info}"
    volumes:
      - chatcli-sessions:/home/chatcli/.chatcli/sessions
      - chatcli-plugins:/home/chatcli/.chatcli/plugins
    restart: unless-stopped
    read_only: true
    tmpfs:
      - /tmp:size=100M
    security_opt:
      - no-new-privileges:true
    deploy:
      resources:
        limits:
          cpus: "2.0"
          memory: 1G

volumes:
  chatcli-sessions:
  chatcli-plugins:
The container runs with a read-only filesystem and no-new-privileges by default. The /tmp directory uses an in-memory tmpfs (limited to 100MB). The named volumes (chatcli-sessions, chatcli-plugins) are the only writable mount points. See the security documentation for details.

Kubernetes (Helm)

ChatCLI includes a complete Helm chart in deploy/helm/chatcli/.

Prerequisites

  • Kubernetes cluster (kind, minikube, EKS, GKE, AKS, etc.)
  • Helm 3.x installed
  • kubectl configured for the cluster

Basic Installation

helm install chatcli deploy/helm/chatcli \
  --set llm.provider=OPENAI \
  --set secrets.openaiApiKey=sk-xxx

Installation with K8s Watcher (Single-Target)

helm install chatcli deploy/helm/chatcli \
  --set llm.provider=OPENAI \
  --set secrets.openaiApiKey=sk-xxx \
  --set watcher.enabled=true \
  --set watcher.deployment=myapp \
  --set watcher.namespace=production

Installation with Multi-Target + Prometheus

To monitor multiple deployments with Prometheus metrics, use a values.yaml:
# values-multi.yaml
llm:
  provider: CLAUDEAI
secrets:
  anthropicApiKey: sk-ant-xxx
watcher:
  enabled: true
  interval: "15s"
  maxContextChars: 32000
  targets:
    - deployment: api-gateway
      namespace: production
      metricsPort: 9090
      metricsFilter: ["http_requests_*", "http_request_duration_*"]
    - deployment: auth-service
      namespace: production
      metricsPort: 9090
    - deployment: worker
      namespace: batch
helm install chatcli deploy/helm/chatcli -f values-multi.yaml
The chart automatically:
  • Creates a ServiceAccount with RBAC for the watcher to read pods, events, and logs
  • Auto-detects multi-namespace: if targets are in different namespaces, uses ClusterRole instead of Role
  • Generates a ConfigMap <name>-watch-config with the multi-target YAML
  • Mounts the config as a volume and passes --watch-config to the container
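For illustration only, the generated <name>-watch-config ConfigMap might look roughly like this for the values above (the file name and key layout inside data are assumptions; the chart's template is authoritative):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: chatcli-watch-config
data:
  watch-config.yaml: |
    targets:
      - deployment: api-gateway
        namespace: production
        metricsPort: 9090
        metricsFilter: ["http_requests_*", "http_request_duration_*"]
      - deployment: worker
        namespace: batch
```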

Helm Chart Values

Server

| Value | Description | Default |
|---|---|---|
| replicaCount | Number of replicas | 1 |
| image.repository | Image repository | ghcr.io/diillson/chatcli |
| image.tag | Image tag | latest |
| server.port | gRPC port | 50051 |
| server.metricsPort | HTTP port for Prometheus metrics (0 = disabled) | 9090 |
| server.token | Authentication token | "" |
| serviceMonitor.enabled | Create ServiceMonitor (requires Prometheus Operator) | false |
| serviceMonitor.interval | Prometheus scrape interval | 30s |

TLS

| Value | Description | Default |
|---|---|---|
| tls.enabled | Enable TLS | false |
| tls.certFile | Certificate path | "" |
| tls.keyFile | Key path | "" |
| tls.existingSecret | Existing Secret with certs | "" |

LLM

| Value | Description | Default |
|---|---|---|
| llm.provider | Default provider | "" |
| llm.model | Default model | "" |

Secrets (API Keys)

| Value | Description |
|---|---|
| secrets.existingSecret | Existing Secret (instead of creating a new one) |
| secrets.openaiApiKey | OpenAI key |
| secrets.anthropicApiKey | Anthropic key |
| secrets.googleaiApiKey | Google AI key |
| secrets.xaiApiKey | xAI key |
| secrets.stackspotClientId | StackSpot Client ID |
| secrets.stackspotClientKey | StackSpot Client Key |
| secrets.stackspotRealm | StackSpot Realm |
| secrets.stackspotAgentId | StackSpot Agent ID |
| secrets.githubCopilotToken | GitHub Copilot OAuth token |

GitHub Copilot

| Value | Description | Default |
|---|---|---|
| COPILOT_MODEL | Default Copilot model (e.g., gpt-4o, claude-sonnet-4) | gpt-4o |
| COPILOT_MAX_TOKENS | Maximum tokens for response | "" |
| COPILOT_API_BASE_URL | API base URL (for enterprise environments) | https://api.githubcopilot.com |
For authentication, use secrets.githubCopilotToken with a token obtained via /auth login github-copilot, or set GITHUB_COPILOT_TOKEN as an environment variable.

Ollama

| Value | Description | Default |
|---|---|---|
| ollama.enabled | Enable Ollama | false |
| ollama.baseUrl | Ollama base URL | http://ollama:11434 |
| ollama.model | Ollama model | "" |

K8s Watcher

| Value | Description | Default |
|---|---|---|
| watcher.enabled | Enable the watcher | false |
| watcher.targets | Multi-deployment target list (see below) | [] |
| watcher.deployment | Single deployment (legacy) | "" |
| watcher.namespace | Deployment namespace (legacy) | "" |
| watcher.interval | Collection interval | 30s |
| watcher.window | Observation window | 2h |
| watcher.maxLogLines | Log lines per pod | 100 |
| watcher.maxContextChars | LLM context budget | 32000 |

Fields for each target (watcher.targets[].):

| Field | Description | Required |
|---|---|---|
| deployment | Deployment name | Yes |
| namespace | Namespace (default: default) | No |
| metricsPort | Prometheus port (0 = disabled) | No |
| metricsPath | HTTP metrics path (default: /metrics) | No |
| metricsFilter | Glob filters for metrics | No |

Provider Fallback

| Value | Description | Default |
|---|---|---|
| fallback.enabled | Enable automatic failover chain | false |
| fallback.providers | Ordered list of providers [{name, model}] | [] |
| fallback.maxRetries | Retries per provider before advancing | 2 |
| fallback.cooldownBase | Base cooldown after failure | 30s |
| fallback.cooldownMax | Maximum cooldown (exponential backoff) | 5m |
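Putting the values above together, a fallback chain might be configured like this sketch (the provider and model names are placeholders, not recommendations):

```yaml
# values-fallback.yaml (illustrative)
fallback:
  enabled: true
  maxRetries: 2          # retries per provider before advancing
  cooldownBase: "30s"    # base cooldown after a failure
  cooldownMax: "5m"      # cap for the exponential backoff
  providers:
    - name: OPENAI
      model: gpt-4o
    - name: CLAUDEAI
      model: claude-sonnet-4
```

Install it with helm install chatcli deploy/helm/chatcli -f values-fallback.yaml, as in the other examples.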

MCP (Model Context Protocol)

| Value | Description | Default |
|---|---|---|
| mcp.enabled | Enable MCP integration | false |
| mcp.servers | List of MCP servers [{name, transport, command, args, url, enabled}] | [] |
| mcp.existingConfigMap | Existing ConfigMap with mcp_servers.json | "" |
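A sketch of mcp.servers in a values file, using the fields listed above (the server names, commands, URL, and transport values are illustrative placeholders):

```yaml
mcp:
  enabled: true
  servers:
    - name: filesystem          # hypothetical command-based server
      transport: stdio
      command: npx
      args: ["-y", "@modelcontextprotocol/server-filesystem", "/data"]
      enabled: true
    - name: remote-tools        # hypothetical URL-based server
      transport: sse
      url: "http://mcp-tools:8080/sse"
      enabled: true
```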

Bootstrap and Memory

| Value | Description | Default |
|---|---|---|
| bootstrap.enabled | Load bootstrap files (SOUL.md, USER.md, etc.) | false |
| bootstrap.definitions | Inline bootstrap file definitions | {} |
| bootstrap.existingConfigMap | Existing ConfigMap with bootstrap files | "" |
| memory.enabled | Enable persistent memory | false |
| safety.enabled | Enable configurable safety rules | false |
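As a hedged sketch, inline bootstrap definitions could look like this (the exact shape of bootstrap.definitions is an assumption; the file contents are placeholders):

```yaml
bootstrap:
  enabled: true
  definitions:
    SOUL.md: |
      You are a concise, cautious SRE assistant.
    USER.md: |
      Team: platform engineering. Primary cluster: production.
memory:
  enabled: true
safety:
  enabled: true
```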

Skill Registry

| Value | Description | Default |
|---|---|---|
| skillRegistry.enabled | Enable environment variables for skill registry | false |
| skillRegistry.registryUrls | Additional registry URLs (comma-separated) | "" |
| skillRegistry.registryDisable | Registry names to disable (comma-separated) | "" |
| skillRegistry.installDir | Skill installation directory inside the container | "" |
When enabled, the values are passed as CHATCLI_REGISTRY_* environment variables in the ConfigMap. The ChatCLI container automatically creates ~/.chatcli/registries.yaml with the default registries (chatcli, clawhub). Use /skill search and /skill install to manage skills via registries.
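For example (the registry URL and install directory below are placeholders):

```yaml
skillRegistry:
  enabled: true
  registryUrls: "https://registry.example.com/index.yaml"   # extra registries, comma-separated
  registryDisable: "clawhub"                                # disable a default registry
  installDir: "/home/chatcli/.chatcli/skills"               # install path inside the container
```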

Persistence

| Value | Description | Default |
|---|---|---|
| persistence.enabled | Persist sessions in PVC | true |
| persistence.storageClass | Storage class | "" |
| persistence.size | Volume size | 1Gi |

Security

| Value | Description | Default |
|---|---|---|
| podSecurityContext.runAsNonRoot | Enforce non-root execution | true |
| podSecurityContext.runAsUser | Process UID | 1000 |
| podSecurityContext.seccompProfile.type | Seccomp profile | RuntimeDefault |
| securityContext.allowPrivilegeEscalation | Allow privilege escalation | false |
| securityContext.readOnlyRootFilesystem | Read-only filesystem | true |
| securityContext.capabilities.drop | Dropped capabilities | ALL |
| rbac.clusterWide | Use ClusterRole instead of namespace-scoped Role | false |
When readOnlyRootFilesystem is true, the chart automatically mounts a tmpfs at /tmp and an emptyDir at /home/chatcli/.chatcli (200Mi) for runtime data. The HOME=/home/chatcli variable is set automatically. To monitor multiple namespaces, enable rbac.clusterWide: true. See the security documentation for details.

Note: the ConfigMap and Secret referenced via envFrom are marked as optional: true, allowing you to create the Instance/Deployment before the dependent resources. The operator watches Secrets automatically and triggers rolling updates when they are created or updated.

Networking

| Value | Description | Default |
|---|---|---|
| service.type | Service type | ClusterIP |
| service.port | Service port | 50051 |
| service.headless | Enable headless Service for gRPC client-side load balancing (recommended when replicaCount > 1) | false |
| ingress.enabled | Enable Ingress | false |
gRPC and multiple replicas: gRPC uses persistent HTTP/2 connections that pin to a single pod. For replicaCount > 1, enable service.headless: true to activate round-robin load balancing via DNS. The client already has built-in keepalive and round-robin support.
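A minimal sketch of scaling out with the headless Service enabled:

```yaml
# values-scale.yaml (illustrative)
replicaCount: 3
service:
  headless: true   # DNS-based round-robin for gRPC clients
```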

Using an Existing Secret

If you already have a Secret with the API keys:
helm install chatcli deploy/helm/chatcli \
  --set llm.provider=OPENAI \
  --set secrets.existingSecret=my-llm-keys
The Secret must contain the expected keys:
apiVersion: v1
kind: Secret
metadata:
  name: my-llm-keys
type: Opaque
stringData:
  OPENAI_API_KEY: "sk-xxx"
  ANTHROPIC_API_KEY: "sk-ant-xxx"
  GITHUB_COPILOT_TOKEN: "ghu_xxx"  # optional

Accessing the Server

kubectl port-forward svc/chatcli 50051:50051
chatcli connect localhost:50051

Ingress (with TLS)

# values-prod.yaml
ingress:
  enabled: true
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
    - host: chatcli.mydomain.com
      paths:
        - path: /
          pathType: ImplementationSpecific
  tls:
    - secretName: chatcli-tls
      hosts:
        - chatcli.mydomain.com
helm install chatcli deploy/helm/chatcli -f values-prod.yaml

Upgrade and Rollback

# Upgrade
helm upgrade chatcli deploy/helm/chatcli --set llm.model=gpt-4-turbo

# Rollback
helm rollback chatcli 1

Full Example: Production

Single-Target (Legacy)

helm install chatcli deploy/helm/chatcli \
  --namespace chatcli --create-namespace \
  --set llm.provider=CLAUDEAI \
  --set secrets.anthropicApiKey=sk-ant-xxx \
  --set server.token=super-secret-token \
  --set tls.enabled=true \
  --set tls.existingSecret=chatcli-tls-certs \
  --set watcher.enabled=true \
  --set watcher.deployment=production-app \
  --set watcher.namespace=production \
  --set persistence.enabled=true \
  --set persistence.size=5Gi \
  --set resources.requests.memory=256Mi \
  --set resources.limits.memory=1Gi

Multi-Target (values file)

# values-prod.yaml
llm:
  provider: CLAUDEAI
secrets:
  existingSecret: chatcli-llm-keys
server:
  token: super-secret-token
tls:
  enabled: true
  existingSecret: chatcli-tls-certs
watcher:
  enabled: true
  interval: "15s"
  maxContextChars: 10000
  targets:
    - deployment: api-gateway
      namespace: production
      metricsPort: 9090
      metricsFilter: ["http_requests_*", "http_request_duration_*"]
    - deployment: auth-service
      namespace: production
      metricsPort: 9090
    - deployment: payment-service
      namespace: production
      metricsPort: 9090
      metricsFilter: ["payment_*", "stripe_*"]
    - deployment: worker
      namespace: batch
persistence:
  enabled: true
  size: 5Gi
resources:
  requests:
    memory: 256Mi
  limits:
    memory: 1Gi
helm install chatcli deploy/helm/chatcli \
  --namespace chatcli --create-namespace \
  -f values-prod.yaml
When targets are in different namespaces (e.g., production and batch), the chart automatically creates a ClusterRole instead of a namespace-scoped Role.

Next Steps