API Overview - ChatCLI

The ChatCLI AIOps Platform REST API provides full programmatic access to every platform feature. It follows Kubernetes-like resource patterns (apiVersion + kind + metadata + spec + status) and uses API key authentication with per-role rate limiting.

Incidents

Detection, ack, snooze, timeline, remediation, and resolution

Runbooks

Full CRUD for remediation plans

Analytics

MTTD, MTTR, trends, top resources, capacity, compliance

SLOs

Targets, error budget, burn rate, and history

Federation

Multi-cluster status, cross-tier correlations

Health

Liveness and readiness probes

Base URL

http://<operator-host>:8090/api/v1

The default port is 8090. It can be changed via Helm (--set apiPort=...) or the CHATCLI_API_PORT environment variable. In production, expose the API behind an Ingress with TLS.

Request flow

Authentication

All requests must include the X-API-Key header with a valid key:

curl -H "X-API-Key: ck_live_abc123" \
  http://operator:8090/api/v1/incidents
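
For clients written in Python, the same authenticated call can be sketched with the standard library alone. The helper name build_request and the BASE_URL constant are illustrative, not part of the platform. The host name operator and the sample key are copied from the curl example above.

```python
import urllib.request

# Host and port copied from the curl example; adjust for your deployment.
BASE_URL = "http://operator:8090/api/v1"


def build_request(path: str, api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Return a GET request for `path` that carries the X-API-Key header."""
    url = f"{base_url.rstrip('/')}/{path.lstrip('/')}"
    return urllib.request.Request(url, headers={"X-API-Key": api_key}, method="GET")


req = build_request("/incidents", "ck_live_abc123")
print(req.full_url)

# urllib normalizes stored header names with str.capitalize(), so the lookup
# key is "X-api-key". HTTP header names are case-insensitive on the wire.
print(req.get_header("X-api-key"))

# To actually send it (requires a reachable operator):
# with urllib.request.urlopen(req, timeout=10) as resp:
#     body = resp.read().decode("utf-8")
```

Building the request separately from sending it keeps the header logic easy to test without a running operator.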

Roles

viewer

Read-only. GET on all endpoints. Ideal for dashboards and observability tools.

operator

Day-to-day operations. GET plus POST actions (acknowledge, approve, reject). Intended for NOC, SRE, and on-call teams.

admin

Full access. GET, POST, PUT, and DELETE. Intended for CI/CD, privileged automation, and management tooling.

API keys are configured in the operator’s ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: chatcli-operator-config
  namespace: chatcli-system
data:
  api-keys: |
    - key: "ck_live_abc123..."
      role: admin
      description: "CI/CD Pipeline"
    - key: "ck_live_def456..."
      role: operator
      description: "NOC Team"
    - key: "ck_live_ghi789..."
      role: viewer
      description: "Grafana dashboard"

In dev environments, if the chatcli-api-keys ConfigMap is missing, the operator runs in dev mode without authentication. This is useful for local testing but must never be used in production.

Rate limiting

Role	Limit	Window
`viewer`	100 req	per minute
`operator`	500 req	per minute
`admin`	1000 req	per minute

Rate limit headers are returned in every response:

X-RateLimit-Limit: 500
X-RateLimit-Remaining: 487
X-RateLimit-Reset: 1710864000

When the limit is exceeded the operator returns 429 Too Many Requests with a Retry-After header (seconds). Implement exponential backoff in production clients.
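
One way to implement the recommended backoff is sketched below. It prefers the server's Retry-After value when present and otherwise falls back to capped exponential delays. The function names, the 0.5 s base, the 30 s cap, and the 10% jitter are illustrative choices, not values defined by the platform. The send callable is a hypothetical transport hook returning (status, headers, body).

```python
import random
import time
from typing import Callable, Mapping, Optional, Tuple

Response = Tuple[int, Mapping[str, str], bytes]  # (status, headers, body)


def backoff_delay(attempt: int, retry_after: Optional[str],
                  base: float = 0.5, cap: float = 30.0) -> float:
    """Seconds to wait before retrying after failed attempt `attempt` (0-based)."""
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))  # Retry-After is given in seconds
        except ValueError:
            pass  # unparsable header: fall back to exponential backoff
    return min(cap, base * (2 ** attempt))


def call_with_retry(send: Callable[[], Response], max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until it returns something other than 429 or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts - 1:
            break
        delay = backoff_delay(attempt, headers.get("Retry-After"))
        # Add up to 10% jitter so many clients do not retry in lockstep.
        sleep(delay + random.uniform(0, 0.1 * delay))
    return status, headers, body
```

Note that a plain dict is case-sensitive, so a real transport should pass a case-insensitive header mapping (for example the headers object of an http.client response).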

Response format

All responses follow a Kubernetes-like pattern:

List

{
  "apiVersion": "v1",
  "kind": "IncidentList",
  "metadata": {
    "totalCount": 42,
    "page": 1,
    "pageSize": 20
  },
  "items": [
    { "..." : "..." }
  ]
}

Single resource

{
  "apiVersion": "v1",
  "kind": "Incident",
  "metadata": {
    "name": "INC-20260319-001",
    "namespace": "production",
    "createdAt": "2026-03-19T15:20:00Z"
  },
  "spec":   { "...": "..." },
  "status": { "...": "..." }
}

Error

{
  "apiVersion": "v1",
  "kind": "Error",
  "error": {
    "code": 401,
    "message": "Invalid or missing API key",
    "details": "Include the X-API-Key header with a valid key"
  }
}
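
Because every response carries a kind field, a client can dispatch on it. The sketch below is one possible approach: it raises on kind "Error", returns the items array for list responses, and returns the whole document for single resources. ApiError and unwrap are hypothetical names. Treating any document that has an items key as a list is an assumption based on the IncidentList sample.

```python
import json
from typing import Any


class ApiError(Exception):
    """Raised when the response envelope has kind == "Error"."""

    def __init__(self, code: int, message: str, details: str = "") -> None:
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.details = details


def unwrap(body: str) -> Any:
    """Parse a response body and return its payload, or raise ApiError."""
    doc = json.loads(body)
    if doc.get("kind") == "Error":
        err = doc.get("error", {})
        raise ApiError(err.get("code", 0), err.get("message", ""), err.get("details", ""))
    if "items" in doc:
        # Assumption: list responses are the ones that carry an "items" array.
        return doc["items"]
    return doc  # single resource: metadata, spec, and status stay together
```

A caller would typically wrap unwrap in a try/except ApiError block and branch on the code attribute.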

Error codes

Code	Meaning	When it happens
`400`	Bad Request	Missing or malformed parameters
`401`	Unauthorized	`X-API-Key` missing or invalid
`403`	Forbidden	Insufficient role for the operation
`404`	Not Found	Resource does not exist
`409`	Conflict	Resource already exists or invalid state for the operation
`429`	Too Many Requests	Rate limit exceeded — see `Retry-After`
`500`	Internal Server Error	Operator failure — inspect logs
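
As a rough illustration of how a client might act on the codes in this table, the sketch below groups them into coarse handling categories. The category names and the function classify_status are illustrative. The grouping itself is a suggestion, not behavior prescribed by the API.

```python
def classify_status(status: int) -> str:
    """Map an HTTP status code to a coarse client-side handling category."""
    if 200 <= status < 300:
        return "ok"
    if status in (401, 403):
        return "auth"          # fix the key or the role; retrying will not help
    if status == 429:
        return "rate_limited"  # wait for Retry-After, then retry
    if 400 <= status < 500:
        return "client"        # 400, 404, 409: fix the request or resource state
    if status >= 500:
        return "server"        # 500: operator failure, inspect logs
    return "unexpected"        # 1xx/3xx are not listed in the table


for code in (400, 401, 403, 404, 409, 429, 500):
    print(code, classify_status(code))
```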

Pagination

Endpoints that return lists support pagination via query parameters:

Parameter	Type	Default	Description
`page`	integer	`1`	Page number (starts at 1)
`pageSize`	integer	`20`	Items per page (maximum: 100)

curl -H "X-API-Key: $KEY" \
  "http://operator:8090/api/v1/incidents?page=2&pageSize=50"

The response includes metadata.totalCount so you can compute the total number of pages.
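
The page count is the ceiling of totalCount divided by pageSize. A minimal sketch of that arithmetic, plus a generator that walks every page, is shown below. The names total_pages and iter_items are hypothetical, and fetch_page is a placeholder for whatever function performs the HTTP call and returns the parsed list document.

```python
from typing import Any, Callable, Dict, Iterator


def total_pages(total_count: int, page_size: int) -> int:
    """Ceiling of total_count / page_size using integer arithmetic."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    return (total_count + page_size - 1) // page_size


def iter_items(fetch_page: Callable[[int, int], Dict[str, Any]],
               page_size: int = 20) -> Iterator[Any]:
    """Yield every item across all pages, starting at page 1."""
    if not 1 <= page_size <= 100:  # documented maximum is 100
        raise ValueError("page_size must be between 1 and 100")
    page = 1
    while True:
        doc = fetch_page(page, page_size)
        yield from doc.get("items", [])
        if page >= total_pages(doc["metadata"]["totalCount"], page_size):
            break
        page += 1


# With the sample list response (totalCount 42, pageSize 20):
print(total_pages(42, 20))  # 3
```

Recomputing the page count on every response keeps the loop correct if totalCount changes while paging.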

Versioning

The API uses path-based versioning (/api/v1/). Future versions will be added as /api/v2/ while maintaining backward compatibility with v1.

Breaking changes happen only across major versions. Within a version, only backward-compatible additions (new optional fields, new endpoints) are released.

Next steps

AIOps Platform overview

How the platform detects, analyzes, and remediates incidents

Kubernetes Operator

Operator deployment, CRDs, and configuration

Incident lifecycle

Full flow: detection → analysis → remediation → resolution

AIOps in production

Cookbook: full setup with TLS, RBAC, notifications, and SLOs
