The K8s Watcher allows ChatCLI to monitor multiple deployments simultaneously, collecting infrastructure and application metrics, logs, events, and pod status. The context is automatically injected into LLM prompts with intelligent budget management to avoid exceeding the context window.

Architecture

ChatCLI -> ResourceWatcher -> 6 Collectors -> ObservabilityStore -> Summarizer -> LLM
Each ResourceWatcher runs six core collectors plus an optional PrometheusCollector, and all watchers share a single Kubernetes clientset, minimizing API-server connections.
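The fan-out can be sketched as follows. This is an illustrative sketch only: the type and function names are assumptions, and the real code uses *kubernetes.Clientset from client-go rather than the stand-in struct here.

```go
// Wiring sketch: each ResourceWatcher owns its collectors, but every
// collector reuses the same shared clientset.
package main

import "fmt"

type clientset struct{ host string } // stand-in for *kubernetes.Clientset

type collector struct {
	name string
	cs   *clientset // shared connection to the API server
}

type resourceWatcher struct {
	deployment, namespace string
	collectors            []collector
}

// newResourceWatcher builds the six core collectors for one target.
func newResourceWatcher(cs *clientset, dep, ns string) resourceWatcher {
	names := []string{"deployment", "podstatus", "events", "logs", "metrics", "hpa"}
	var cols []collector
	for _, n := range names {
		cols = append(cols, collector{n, cs})
	}
	return resourceWatcher{dep, ns, cols}
}

func main() {
	cs := &clientset{host: "https://kube-apiserver:6443"}
	w := newResourceWatcher(cs, "api-gateway", "production")
	fmt.Println(len(w.collectors), w.collectors[0].cs == cs) // 6 true
}
```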

Usage Modes

chatcli watch --deployment myapp --namespace production
chatcli watch --deployment myapp -p "Is the deployment healthy?"

Multi-Target Configuration File

# targets.yaml
interval: "30s"           # Collection interval
window: "2h"              # Time window of retained data
maxLogLines: 100          # Log lines per pod per cycle
maxContextChars: 32000     # Maximum character budget for LLM context

targets:
  - deployment: api-gateway
    namespace: production
    metricsPort: 9090                                        # Prometheus port
    metricsFilter: ["http_requests_total", "http_request_duration_*"]

  - deployment: auth-service
    namespace: production
    metricsPort: 9090

  - deployment: worker
    namespace: batch
    # No metricsPort = Prometheus disabled for this target

  - deployment: frontend
    namespace: production
    metricsPort: 3000
    metricsPath: "/custom-metrics"                           # Custom path
    metricsFilter: ["next_*", "react_render_*"]

Target Fields

| Field | Description | Required |
|---|---|---|
| deployment | Deployment name | Yes |
| namespace | Namespace (default: default) | No |
| metricsPort | Prometheus endpoint port (0 = disabled) | No |
| metricsPath | HTTP path for metrics (default: /metrics) | No |
| metricsFilter | Glob filters for metrics (empty = all) | No |

Complete Flags

chatcli watch

| Flag | Description | Default | Env Var |
|---|---|---|---|
| --config | Multi-target YAML file | | |
| --deployment | Single deployment (legacy) | | CHATCLI_WATCH_DEPLOYMENT |
| --namespace | Deployment namespace | default | CHATCLI_WATCH_NAMESPACE |
| --interval | Interval between collections | 30s | CHATCLI_WATCH_INTERVAL |
| --window | Data time window | 2h | CHATCLI_WATCH_WINDOW |
| --max-log-lines | Log lines per pod | 100 | CHATCLI_WATCH_MAX_LOG_LINES |
| --kubeconfig | Kubeconfig path | Auto-detected | CHATCLI_KUBECONFIG |
| --provider | LLM provider | .env | LLM_PROVIDER |
| --model | LLM model | .env | |
| -p <prompt> | One-shot: send and exit | | |
| --max-tokens | Token limit in response | | |

chatcli server (watcher flags)

| Flag | Description | Default | Env Var |
|---|---|---|---|
| --watch-config | Multi-target YAML file | | CHATCLI_WATCH_CONFIG |
| --watch-deployment | Single deployment (legacy) | | CHATCLI_WATCH_DEPLOYMENT |
| --watch-namespace | Namespace | default | CHATCLI_WATCH_NAMESPACE |
| --watch-interval | Collection interval | 30s | CHATCLI_WATCH_INTERVAL |
| --watch-window | Observation window | 2h | CHATCLI_WATCH_WINDOW |
| --watch-max-log-lines | Max log lines | 100 | CHATCLI_WATCH_MAX_LOG_LINES |
| --watch-kubeconfig | Kubeconfig path | Auto-detected | CHATCLI_KUBECONFIG |

What Is Collected

Collectors per Target

| Collector | Data Collected |
|---|---|
| Deployment | Replicas (ready/available/updated), strategy, conditions |
| Pod Status | Phase, readiness, restarts, termination info, container status |
| Events | K8s events (Warning/Normal), message, reason, timestamp |
| Logs | Last N lines per container per pod |
| Metrics | CPU and memory per pod (via metrics-server) |
| HPA | Min/max replicas, current metrics, desired replicas |
| Prometheus | Application metrics from the pod /metrics endpoint |

Prometheus Collector (New)

The PrometheusCollector scrapes Prometheus metrics directly from pods:
  • Discovers deployment pods and selects 1 Ready pod
  • Makes HTTP GET to http://podIP:port/path (timeout: 5s)
  • Parses the Prometheus text exposition format (stdlib, no dependencies)
  • Filters by configured glob patterns
  • Ignores NaN, Inf, and comment lines
Glob filter examples:
metricsFilter:
  - "http_requests_*"          # All HTTP metrics
  - "process_*"                # Process metrics
  - "go_goroutines"            # Specific metric
  - "*_duration_seconds_*"     # Any duration metric
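The parsing and filtering steps above can be sketched with only the standard library, in the spirit of the collector's "stdlib, no dependencies" parser. This is a simplified sketch, not the real implementation: it sums values across label sets, assumes label blocks contain no spaces, and the function names are illustrative.

```go
// parseAndFilter reads Prometheus text exposition format, skips comments
// and NaN/Inf values, and keeps only metrics matching the glob patterns.
package main

import (
	"fmt"
	"math"
	"path"
	"strconv"
	"strings"
)

func parseAndFilter(exposition string, patterns []string) map[string]float64 {
	out := map[string]float64{}
	for _, line := range strings.Split(exposition, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue // blank, # HELP, or # TYPE line
		}
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue
		}
		// Strip the label block: http_requests_total{code="200"} -> http_requests_total
		name := fields[0]
		if i := strings.IndexByte(name, '{'); i >= 0 {
			name = name[:i]
		}
		val, err := strconv.ParseFloat(fields[1], 64)
		if err != nil || math.IsNaN(val) || math.IsInf(val, 0) {
			continue // unparseable, NaN, or Inf sample
		}
		if matchesAny(name, patterns) {
			out[name] += val // sum across label sets for this sketch
		}
	}
	return out
}

func matchesAny(name string, patterns []string) bool {
	if len(patterns) == 0 {
		return true // empty filter = keep all
	}
	for _, p := range patterns {
		if ok, _ := path.Match(p, name); ok {
			return true
		}
	}
	return false
}

func main() {
	body := `# HELP http_requests_total Total requests.
http_requests_total{code="200"} 1500
http_requests_total{code="500"} 42
go_goroutines 245`
	m := parseAndFilter(body, []string{"http_requests_*"})
	fmt.Println(m["http_requests_total"], len(m)) // 1542 1
}
```

path.Match works here because metric names never contain "/", so "*" spans the whole name.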

Context Budget Management (MultiSummarizer)

With multiple targets, the MultiSummarizer ensures the context does not exceed the LLM window:

Algorithm

1. Score each target: 0 = healthy, 1 = warning, 2 = critical.
  • Critical: CrashLoopBackOff, OOMKilled, critical alerts
  • Warning: replicas < desired, error logs, warning alerts
  • Healthy: everything ok
2. Sort by priority: critical first, then warning, then healthy.
3. Allocate context:
  • Score >= 1 — full context (~1-3 KB per target)
  • Score == 0 — compact one-liner (~80 chars per target)
4. Compress if exceeding maxContextChars: healthy targets are compressed first.
5. Omit if still exceeding: healthy targets are omitted when necessary.
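A condensed sketch of the sort/allocate/omit steps. The type and function names are assumptions, not the real MultiSummarizer API, and the compression step is folded into the omission step for brevity.

```go
// buildContext sorts targets by score, gives full detail to score >= 1,
// one-liners to healthy targets, and drops healthy targets over budget.
package main

import (
	"fmt"
	"sort"
	"strings"
)

type targetSummary struct {
	name    string
	score   int    // 0 = healthy, 1 = warning, 2 = critical
	full    string // detailed context block
	oneLine string // compact ~80-char fallback
}

func buildContext(targets []targetSummary, maxChars int) string {
	// Step 2: critical first, then warning, then healthy.
	sort.SliceStable(targets, func(i, j int) bool {
		return targets[i].score > targets[j].score
	})
	var parts []string
	used := 0
	for _, t := range targets {
		s := t.oneLine
		if t.score >= 1 {
			s = t.full // step 3: unhealthy targets get full context
		}
		if t.score == 0 && used+len(s) > maxChars {
			continue // step 5: omit healthy targets that no longer fit
		}
		parts = append(parts, s)
		used += len(s)
	}
	return strings.Join(parts, "\n")
}

func main() {
	ctx := buildContext([]targetSummary{
		{"frontend", 0, "FRONTEND-DETAIL", "frontend: 2/2 ready"},
		{"api-gateway", 2, "API-GATEWAY: CrashLoopBackOff detail...", "api-gateway: ok"},
	}, 200)
	fmt.Println(strings.HasPrefix(ctx, "API-GATEWAY")) // critical target leads
}
```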

Example with 20 Targets (2 with issues)

[K8s Multi-Watcher: 20 targets monitored]

--- Targets Requiring Attention ---

[K8s Context: deployment/api-gateway in namespace/production]
Collected at: 2026-02-15T10:30:00Z

## Deployment Status
  Replicas: 2/3 ready, 3 updated, 2 available
  Strategy: RollingUpdate

## Pods (3 total)
  Total restarts: 12 (delta in window: 8)
  - api-gateway-abc12: Running [Ready] restarts=0 cpu=45m mem=128Mi
  - api-gateway-def34: Running [Ready] restarts=0 cpu=52m mem=135Mi
  - api-gateway-ghi56: Running [NOT READY] restarts=8 cpu=12m mem=95Mi
    Last terminated: OOMKilled (exit code 137) at 2026-02-15T10:28:00Z

## Application Metrics (4)
  http_request_duration_seconds_sum: 8453
  http_requests_total: 1.542e+06
  process_resident_memory_bytes: 1.34e+08
  go_goroutines: 245

## Active Alerts (2)
  [CRITICAL] CrashLoopBackOff: pod/api-gateway-ghi56
  [CRITICAL] OOMKilled: pod/api-gateway-ghi56

## Recent Error Logs (3)
  [10:27:45] api-gateway-ghi56/app: OutOfMemoryError: heap space
  [10:27:46] api-gateway-ghi56/app: Shutting down...
  [10:28:00] api-gateway-ghi56/app: Process exited with code 137

--- Healthy Targets ---
- production/auth-service: 3/3 pods ready | healthy | 0 alerts | 42 snapshots
- production/frontend: 2/2 pods ready | healthy | 0 alerts | 42 snapshots
- production/backend: 5/5 pods ready | healthy | 0 alerts | 42 snapshots
- batch/worker: 3/3 pods ready | healthy | 0 alerts | 42 snapshots
... (16 compact targets)
Total budget: ~2 KB (detail) + 18 x 80 chars (compact) = ~3.5 KB, within the 8 KB limit.

Anomaly Detection

| Anomaly | Condition | Severity |
|---|---|---|
| CrashLoopBackOff | Pod with more than 5 restarts | Critical |
| OOMKilled | Container terminated due to lack of memory | Critical |
| PodNotReady | Pod is not in the Ready state | Warning |
| DeploymentFailing | Deployment with Available=False | Critical |
Alerts are included in the context sent to the LLM and influence the budget priority of the MultiSummarizer.
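The pod-level rules in the table can be expressed as simple threshold checks. Thresholds follow the documented conditions; the types and names are illustrative, not the real detector.

```go
// detect applies the documented pod-level anomaly rules:
// restarts > 5 -> CrashLoopBackOff (critical), OOMKilled -> critical,
// not Ready -> PodNotReady (warning).
package main

import "fmt"

type severity string

const (
	warning  severity = "Warning"
	critical severity = "Critical"
)

type podState struct {
	name      string
	restarts  int
	ready     bool
	oomKilled bool
}

type alert struct {
	kind string
	sev  severity
	pod  string
}

func detect(p podState) []alert {
	var out []alert
	if p.restarts > 5 {
		out = append(out, alert{"CrashLoopBackOff", critical, p.name})
	}
	if p.oomKilled {
		out = append(out, alert{"OOMKilled", critical, p.name})
	}
	if !p.ready {
		out = append(out, alert{"PodNotReady", warning, p.name})
	}
	return out
}

func main() {
	// Mirrors the api-gateway-ghi56 pod from the example context above.
	alerts := detect(podState{name: "api-gateway-ghi56", restarts: 8, ready: false, oomKilled: true})
	fmt.Println(len(alerts)) // 3
}
```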

Observability Store

Collected data is stored in a ring buffer per target with a configurable time window:
  • Snapshots: Complete periodic state (pods, deployment, HPA, events, metrics, app metrics)
  • Logs: Recent logs from each pod with classification (info/warning/error)
  • Alerts: Detected anomalies with severity and timestamps

Automatic Rotation

Data older than the time window (--window) is automatically discarded, keeping memory usage constant regardless of the number of targets.

/watch Command

Inside interactive ChatCLI (local or remote), use /watch to see the status:
/watch
K8s Watcher Active
  Deployment:  myapp
  Namespace:   production
  Snapshots:   42
  Pods:        3
  Alerts:      1

One-Shot with K8s Context

# Single deployment
chatcli watch --deployment myapp -p "Is the deployment healthy?"

# Multi-target
chatcli watch --config targets.yaml -p "Summarize the status of all deployments"

# Via remote server
chatcli connect myserver:50051 -p "Why are the pods restarting?"

Example Questions

> Is the deployment healthy?
> Which deployments need attention?
> Why is pod xyz restarting?
> Analyze the HTTP metrics of api-gateway. Is the latency acceptable?
> Compare the auth-service state with 30 minutes ago
> What warning events occurred in the last hour?
> Based on the Prometheus metrics, do I need to scale any deployment?
> Summarize the status of all targets for a team report

Requirements

  • Kubernetes Cluster: Access via kubeconfig or in-cluster config
  • RBAC Permissions: Read access to pods, events, logs, deployments, HPA, ingresses
  • metrics-server (optional): For CPU/memory collection
  • Prometheus endpoints (optional): Apps that expose /metrics in Prometheus text format

RBAC

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: chatcli-watcher
  namespace: production
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log", "events", "services", "endpoints"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]
    resources: ["deployments", "replicasets", "statefulsets", "daemonsets"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["autoscaling"]
    resources: ["horizontalpodautoscalers"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["networking.k8s.io"]
    resources: ["ingresses"]
    verbs: ["get", "list"]
  - apiGroups: ["metrics.k8s.io"]
    resources: ["pods"]
    verbs: ["get", "list"]

AIOps Integration

K8s Watcher alerts automatically feed into the Operator’s AIOps pipeline. When the Operator detects alerts via GetAlerts RPC, it creates Anomaly CRs that are correlated into Issues, analyzed by AI, and automatically remediated.
Alerts detected by Watcher -> Anomaly -> Issue -> AIInsight -> RemediationPlan -> Resolution
See AIOps Platform for the complete flow.

Next Steps