ChatCLI can be packaged as a Docker container and deployed on Kubernetes using the official Helm chart. This page covers all deployment scenarios.
Official Images (GHCR)
Official Docker images are automatically published to the GitHub Container Registry with each release:
ChatCLI Server: ghcr.io/diillson/chatcli:latest
Kubernetes Operator: ghcr.io/diillson/chatcli-operator:latest
# Pull the server image
docker pull ghcr.io/diillson/chatcli:latest
# Or a specific version
docker pull ghcr.io/diillson/chatcli:v1.57.0
# Pull the operator image
docker pull ghcr.io/diillson/chatcli-operator:latest
The images support multi-arch (linux/amd64 and linux/arm64).
Docker
Building the Image (Local)
# From the project root
docker build -t chatcli .
The Dockerfile uses a multi-stage build to produce a minimal image (~20MB):
Build stage: golang:1.25-alpine compiles the binary
Runtime stage: alpine:3.21 with a non-root user and a built-in health check
Building the Operator Image (Local)
# IMPORTANT: must be built from the repository root
# (the operator's go.mod uses a replace directive pointing to ../)
docker build -f operator/Dockerfile -t ghcr.io/diillson/chatcli-operator:latest .
The operator Dockerfile uses:
Build stage: golang:1.25 with multi-arch support (TARGETARCH)
Runtime stage: gcr.io/distroless/static:nonroot (maximum security, no shell)
Running with Docker
Basic
With Auth
With Persistence
docker run -p 50051:50051 \
-e LLM_PROVIDER=OPENAI \
-e OPENAI_API_KEY=sk-xxx \
chatcli
docker run -p 50051:50051 \
-e CHATCLI_SERVER_TOKEN=my-token \
-e LLM_PROVIDER=CLAUDEAI \
-e ANTHROPIC_API_KEY=sk-ant-xxx \
chatcli
docker run -p 50051:50051 \
-v chatcli-sessions:/home/chatcli/.chatcli/sessions \
-e LLM_PROVIDER=OPENAI \
-e OPENAI_API_KEY=sk-xxx \
chatcli
Docker Compose
The project includes a docker-compose.yml ready for development:
Set the variables:
export LLM_PROVIDER=OPENAI
export OPENAI_API_KEY=sk-xxx
Start the server:
docker compose up -d
Connect from your terminal:
chatcli connect localhost:50051
Docker Compose configures:
Port 50051 exposed
Persistent volumes for sessions and plugins
Automatic restart (unless-stopped)
All LLM variables via environment
Security hardening: read-only filesystem, no-new-privileges, CPU/memory limits, tmpfs for /tmp
docker-compose.yml File
version: "3.9"

services:
  chatcli-server:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: chatcli-server
    ports:
      - "50051:50051"
    environment:
      CHATCLI_SERVER_PORT: "50051"
      CHATCLI_SERVER_TOKEN: "${CHATCLI_SERVER_TOKEN:-}"
      LLM_PROVIDER: "${LLM_PROVIDER:-}"
      OPENAI_API_KEY: "${OPENAI_API_KEY:-}"
      ANTHROPIC_API_KEY: "${ANTHROPIC_API_KEY:-}"
      GOOGLEAI_API_KEY: "${GOOGLEAI_API_KEY:-}"
      OLLAMA_ENABLED: "${OLLAMA_ENABLED:-}"
      OLLAMA_BASE_URL: "${OLLAMA_BASE_URL:-}"
      GITHUB_COPILOT_TOKEN: "${GITHUB_COPILOT_TOKEN:-}"
      COPILOT_MODEL: "${COPILOT_MODEL:-}"
      LOG_LEVEL: "${LOG_LEVEL:-info}"
    volumes:
      - chatcli-sessions:/home/chatcli/.chatcli/sessions
      - chatcli-plugins:/home/chatcli/.chatcli/plugins
    restart: unless-stopped
    read_only: true
    tmpfs:
      - /tmp:size=100M
    security_opt:
      - no-new-privileges:true
    deploy:
      resources:
        limits:
          cpus: "2.0"
          memory: 1G

volumes:
  chatcli-sessions:
  chatcli-plugins:
The container runs with a read-only filesystem and no-new-privileges by default. The /tmp directory uses an in-memory tmpfs (limited to 100MB). The named volumes (chatcli-sessions, chatcli-plugins) are the only writable mount points. See the security documentation for details.
Kubernetes (Helm)
ChatCLI includes a complete Helm chart in deploy/helm/chatcli/.
Prerequisites
Kubernetes cluster (kind, minikube, EKS, GKE, AKS, etc.)
Helm 3.x installed
kubectl configured for the cluster
Basic Installation
OpenAI
Anthropic (with Auth)
helm install chatcli deploy/helm/chatcli \
--set llm.provider=OPENAI \
--set secrets.openaiApiKey=sk-xxx
helm install chatcli deploy/helm/chatcli \
--set llm.provider=CLAUDEAI \
--set secrets.anthropicApiKey=sk-ant-xxx \
--set server.token=my-secret-token
Installation with K8s Watcher (Single-Target)
helm install chatcli deploy/helm/chatcli \
--set llm.provider=OPENAI \
--set secrets.openaiApiKey=sk-xxx \
--set watcher.enabled=true \
--set watcher.deployment=myapp \
--set watcher.namespace=production
Installation with Multi-Target + Prometheus
To monitor multiple deployments with Prometheus metrics, use a values.yaml:
# values-multi.yaml
llm:
  provider: CLAUDEAI
secrets:
  anthropicApiKey: sk-ant-xxx
watcher:
  enabled: true
  interval: "15s"
  maxContextChars: 32000
  targets:
    - deployment: api-gateway
      namespace: production
      metricsPort: 9090
      metricsFilter: ["http_requests_*", "http_request_duration_*"]
    - deployment: auth-service
      namespace: production
      metricsPort: 9090
    - deployment: worker
      namespace: batch
helm install chatcli deploy/helm/chatcli -f values-multi.yaml
The chart automatically:
Creates a ServiceAccount with RBAC for the watcher to read pods, events, and logs
Auto-detects multi-namespace: if targets are in different namespaces, uses ClusterRole instead of Role
Generates a ConfigMap <name>-watch-config with the multi-target YAML
Mounts the config as a volume and passes --watch-config to the container
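For orientation, the generated ConfigMap looks roughly like the sketch below. This is an assumption from the bullet points above, not the chart's exact output — the ConfigMap name follows the `<name>-watch-config` pattern, and the data key name is hypothetical:

```yaml
# Hedged sketch of the generated watch-config ConfigMap
# (data key name and exact nesting are assumptions)
apiVersion: v1
kind: ConfigMap
metadata:
  name: chatcli-watch-config
data:
  watch-config.yaml: |
    targets:
      - deployment: api-gateway
        namespace: production
        metricsPort: 9090
      - deployment: worker
        namespace: batch
```

The container then receives this file via a volume mount and the --watch-config flag, as described above.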
Helm Chart Values
Server
| Value | Description | Default |
| --- | --- | --- |
| `replicaCount` | Number of replicas | `1` |
| `image.repository` | Image repository | `ghcr.io/diillson/chatcli` |
| `image.tag` | Image tag | `latest` |
| `server.port` | gRPC port | `50051` |
| `server.metricsPort` | HTTP port for Prometheus metrics (0 = disabled) | `9090` |
| `server.token` | Authentication token | `""` |
| `serviceMonitor.enabled` | Create ServiceMonitor (requires Prometheus Operator) | `false` |
| `serviceMonitor.interval` | Prometheus scrape interval | `30s` |
TLS
| Value | Description | Default |
| --- | --- | --- |
| `tls.enabled` | Enable TLS | `false` |
| `tls.certFile` | Certificate path | `""` |
| `tls.keyFile` | Key path | `""` |
| `tls.existingSecret` | Existing Secret with certs | `""` |
LLM
| Value | Description | Default |
| --- | --- | --- |
| `llm.provider` | Default provider | `""` |
| `llm.model` | Default model | `""` |
Secrets (API Keys)
| Value | Description |
| --- | --- |
| `secrets.existingSecret` | Existing Secret (instead of creating a new one) |
| `secrets.openaiApiKey` | OpenAI key |
| `secrets.anthropicApiKey` | Anthropic key |
| `secrets.googleaiApiKey` | Google AI key |
| `secrets.xaiApiKey` | xAI key |
| `secrets.stackspotClientId` | StackSpot Client ID |
| `secrets.stackspotClientKey` | StackSpot Client Key |
| `secrets.stackspotRealm` | StackSpot Realm |
| `secrets.stackspotAgentId` | StackSpot Agent ID |
| `secrets.githubCopilotToken` | GitHub Copilot OAuth token |
GitHub Copilot
| Value | Description | Default |
| --- | --- | --- |
| `COPILOT_MODEL` | Default Copilot model (e.g., gpt-4o, claude-sonnet-4) | `gpt-4o` |
| `COPILOT_MAX_TOKENS` | Maximum tokens for response | `""` |
| `COPILOT_API_BASE_URL` | API base URL (for enterprise environments) | `https://api.githubcopilot.com` |
For authentication, use secrets.githubCopilotToken with a token obtained via /auth login github-copilot, or set GITHUB_COPILOT_TOKEN as an environment variable.
Ollama
| Value | Description | Default |
| --- | --- | --- |
| `ollama.enabled` | Enable Ollama | `false` |
| `ollama.baseUrl` | Ollama base URL | `http://ollama:11434` |
| `ollama.model` | Ollama model | `""` |
K8s Watcher
| Value | Description | Default |
| --- | --- | --- |
| `watcher.enabled` | Enable the watcher | `false` |
| `watcher.targets` | Multi-deployment target list (see below) | `[]` |
| `watcher.deployment` | Single deployment (legacy) | `""` |
| `watcher.namespace` | Deployment namespace (legacy) | `""` |
| `watcher.interval` | Collection interval | `30s` |
| `watcher.window` | Observation window | `2h` |
| `watcher.maxLogLines` | Log lines per pod | `100` |
| `watcher.maxContextChars` | LLM context budget | `32000` |
Fields for each target (watcher.targets[].):
| Field | Description | Required |
| --- | --- | --- |
| `deployment` | Deployment name | Yes |
| `namespace` | Namespace (default: `default`) | No |
| `metricsPort` | Prometheus port (0 = disabled) | No |
| `metricsPath` | HTTP metrics path | No (default `/metrics`) |
| `metricsFilter` | Glob filters for metrics | No |
Provider Fallback
| Value | Description | Default |
| --- | --- | --- |
| `fallback.enabled` | Enable automatic failover chain | `false` |
| `fallback.providers` | Ordered list of providers `[{name, model}]` | `[]` |
| `fallback.maxRetries` | Retries per provider before advancing | `2` |
| `fallback.cooldownBase` | Base cooldown after failure | `30s` |
| `fallback.cooldownMax` | Maximum cooldown (exponential backoff) | `5m` |
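Using only the keys from the table above, a fallback chain can be declared in a values file like this (the provider order and models are illustrative):

```yaml
# Illustrative fallback chain: try Claude first, then OpenAI
fallback:
  enabled: true
  providers:
    - name: CLAUDEAI
      model: claude-sonnet-4
    - name: OPENAI
      model: gpt-4o
  maxRetries: 2          # retries per provider before advancing
  cooldownBase: "30s"    # base cooldown after a failure
  cooldownMax: "5m"      # cap for the exponential backoff
```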
MCP (Model Context Protocol)
| Value | Description | Default |
| --- | --- | --- |
| `mcp.enabled` | Enable MCP integration | `false` |
| `mcp.servers` | List of MCP servers `[{name, transport, command, args, url, enabled}]` | `[]` |
| `mcp.existingConfigMap` | Existing ConfigMap with mcp_servers.json | `""` |
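The entry shape listed above (`{name, transport, command, args, url, enabled}`) suggests a fragment like the following — a hedged sketch in which the server names, command, and URL are hypothetical, and the supported transport values should be checked against the chart:

```yaml
mcp:
  enabled: true
  servers:
    # Local server launched inside the container (command/args are hypothetical)
    - name: local-tools
      transport: stdio
      command: my-mcp-server
      args: ["--root", "/data"]
      enabled: true
    # Remote server reached over HTTP (URL is hypothetical)
    - name: remote-tools
      transport: sse
      url: https://mcp.example.com/sse
      enabled: true
```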
Bootstrap and Memory
| Value | Description | Default |
| --- | --- | --- |
| `bootstrap.enabled` | Load bootstrap files (SOUL.md, USER.md, etc.) | `false` |
| `bootstrap.definitions` | Inline bootstrap file definitions | `{}` |
| `bootstrap.existingConfigMap` | Existing ConfigMap with bootstrap files | `""` |
| `memory.enabled` | Enable persistent memory | `false` |
| `safety.enabled` | Enable configurable safety rules | `false` |
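A values fragment combining these flags could look like the sketch below. The file-name-to-content shape of `bootstrap.definitions` is an assumption based on the "inline bootstrap file definitions" description and the SOUL.md/USER.md file names mentioned above:

```yaml
bootstrap:
  enabled: true
  definitions:
    # Assumed shape: file name -> file content
    SOUL.md: |
      You are the on-call assistant for the platform team.
memory:
  enabled: true
safety:
  enabled: true
```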
Skill Registry
| Value | Description | Default |
| --- | --- | --- |
| `skillRegistry.enabled` | Enable environment variables for skill registry | `false` |
| `skillRegistry.registryUrls` | Additional registry URLs (comma-separated) | `""` |
| `skillRegistry.registryDisable` | Registry names to disable (comma-separated) | `""` |
| `skillRegistry.installDir` | Skill installation directory inside the container | `""` |
When enabled, the values are passed as CHATCLI_REGISTRY_* environment variables in the ConfigMap. The ChatCLI container automatically creates ~/.chatcli/registries.yaml with the default registries (chatcli, clawhub). Use /skill search and /skill install to manage skills via registries.
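For illustration, enabling these registry variables might look like the fragment below — the extra registry URL and the install path are hypothetical examples, not defaults:

```yaml
skillRegistry:
  enabled: true
  registryUrls: "https://skills.example.com/registry.yaml"  # hypothetical extra registry
  registryDisable: "clawhub"                                # disable one of the default registries
  installDir: "/home/chatcli/.chatcli/skills"               # assumed path inside the container
```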
Persistence
| Value | Description | Default |
| --- | --- | --- |
| `persistence.enabled` | Persist sessions in PVC | `true` |
| `persistence.storageClass` | Storage class | `""` |
| `persistence.size` | Volume size | `1Gi` |
Security
| Value | Description | Default |
| --- | --- | --- |
| `podSecurityContext.runAsNonRoot` | Enforce non-root execution | `true` |
| `podSecurityContext.runAsUser` | Process UID | `1000` |
| `podSecurityContext.seccompProfile.type` | Seccomp profile | `RuntimeDefault` |
| `securityContext.allowPrivilegeEscalation` | Allow privilege escalation | `false` |
| `securityContext.readOnlyRootFilesystem` | Read-only filesystem | `true` |
| `securityContext.capabilities.drop` | Dropped capabilities | `ALL` |
| `rbac.clusterWide` | Use ClusterRole instead of namespace-scoped Role | `false` |
When readOnlyRootFilesystem is true, the chart automatically mounts a tmpfs at /tmp and an emptyDir at /home/chatcli/.chatcli (200Mi) for runtime data. The HOME=/home/chatcli variable is set automatically. To monitor multiple namespaces, enable rbac.clusterWide: true. See the security documentation for details.
Note: The ConfigMap and Secret referenced via envFrom are marked as optional: true, allowing you to create the Instance/Deployment before the dependent resources. The operator watches Secrets automatically and triggers rolling updates when they are created or updated.
Networking
| Value | Description | Default |
| --- | --- | --- |
| `service.type` | Service type | `ClusterIP` |
| `service.port` | Service port | `50051` |
| `service.headless` | Enable headless Service for gRPC client-side load balancing (recommended when `replicaCount` > 1) | `false` |
| `ingress.enabled` | Enable Ingress | `false` |
gRPC and multiple replicas : gRPC uses persistent HTTP/2 connections that pin to a single pod. For replicaCount > 1, enable service.headless: true to activate round-robin load balancing via DNS. The client already has built-in keepalive and round-robin support.
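Expressed as a values fragment, the recommended multi-replica setup from the note above (the replica count is just an example):

```yaml
# Multiple replicas need a headless Service so gRPC clients
# can round-robin across pod IPs resolved via DNS
replicaCount: 3
service:
  headless: true
  port: 50051
```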
Using an Existing Secret
If you already have a Secret with the API keys:
helm install chatcli deploy/helm/chatcli \
--set llm.provider=OPENAI \
--set secrets.existingSecret=my-llm-keys
The Secret must contain the expected keys:
apiVersion: v1
kind: Secret
metadata:
  name: my-llm-keys
type: Opaque
stringData:
  OPENAI_API_KEY: "sk-xxx"
  ANTHROPIC_API_KEY: "sk-ant-xxx"
  GITHUB_COPILOT_TOKEN: "ghu_xxx"  # optional
Accessing the Server
Port Forward (Dev)
NodePort
LoadBalancer
kubectl port-forward svc/chatcli 50051:50051
chatcli connect localhost:50051
helm install chatcli deploy/helm/chatcli \
--set service.type=NodePort
chatcli connect <node-ip>:<node-port>
helm install chatcli deploy/helm/chatcli \
--set service.type=LoadBalancer
# Wait for the external IP
kubectl get svc chatcli -w
chatcli connect <external-ip>:50051
Ingress (with TLS)
# values-prod.yaml
ingress:
  enabled: true
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
    - host: chatcli.mydomain.com
      paths:
        - path: /
          pathType: ImplementationSpecific
  tls:
    - secretName: chatcli-tls
      hosts:
        - chatcli.mydomain.com
helm install chatcli deploy/helm/chatcli -f values-prod.yaml
Upgrade and Rollback
# Upgrade
helm upgrade chatcli deploy/helm/chatcli --set llm.model=gpt-4-turbo
# Rollback
helm rollback chatcli 1
Full Example: Production
Single-Target (Legacy)
helm install chatcli deploy/helm/chatcli \
--namespace chatcli --create-namespace \
--set llm.provider=CLAUDEAI \
--set secrets.anthropicApiKey=sk-ant-xxx \
--set server.token=super-secret-token \
--set tls.enabled=true \
--set tls.existingSecret=chatcli-tls-certs \
--set watcher.enabled=true \
--set watcher.deployment=production-app \
--set watcher.namespace=production \
--set persistence.enabled=true \
--set persistence.size=5Gi \
--set resources.requests.memory=256Mi \
--set resources.limits.memory=1Gi
Multi-Target with Prometheus (Recommended)
# values-prod.yaml
llm:
  provider: CLAUDEAI
secrets:
  existingSecret: chatcli-llm-keys
server:
  token: super-secret-token
tls:
  enabled: true
  existingSecret: chatcli-tls-certs
watcher:
  enabled: true
  interval: "15s"
  maxContextChars: 10000
  targets:
    - deployment: api-gateway
      namespace: production
      metricsPort: 9090
      metricsFilter: ["http_requests_*", "http_request_duration_*"]
    - deployment: auth-service
      namespace: production
      metricsPort: 9090
    - deployment: payment-service
      namespace: production
      metricsPort: 9090
      metricsFilter: ["payment_*", "stripe_*"]
    - deployment: worker
      namespace: batch
persistence:
  enabled: true
  size: 5Gi
resources:
  requests:
    memory: 256Mi
  limits:
    memory: 1Gi
helm install chatcli deploy/helm/chatcli \
--namespace chatcli --create-namespace \
-f values-prod.yaml
When targets are in different namespaces (e.g., production and batch), the chart automatically creates a ClusterRole instead of a namespace-scoped Role.
Next Steps