Bootstrap and Persistent Memory

ChatCLI offers two complementary systems for customizing and contextualizing the agent: bootstrap files to define personality and rules, and persistent memory to maintain context across sessions.

The bootstrap and memory system is fully connected to the system prompt flow. Files are automatically loaded and injected into all interactions — in chat mode as well as in /agent and /coder modes.

Bootstrap Files

Bootstrap files are Markdown documents automatically loaded into the agent’s system prompt. They define who the assistant is, how it behaves, and which rules it should follow.

Supported Files

ChatCLI recognizes exactly five bootstrap files and loads them in this order. All are optional: a file that does not exist is simply skipped.

File	Purpose	When to use
`AGENTS.md`	Sub-agent definitions and their roles	When you want to instruct the orchestrator about available agents and how to use them
`SOUL.md`	Assistant personality, tone and style	To define “who” the assistant is — how it speaks, thinks and behaves
`USER.md`	User preferences and project context	To inform the stack, conventions, preferred tools and project context
`IDENTITY.md`	Agent identity and capabilities	To define “what” the assistant is — name, capabilities, limitations
`RULES.md`	Explicit rules and restrictions	For strict guardrails — what it MUST and MUST NOT do

File names are exact and case-sensitive. ChatCLI only looks for AGENTS.md, SOUL.md, USER.md, IDENTITY.md, and RULES.md. Other names (like CLAUDE.md, README.md, etc.) are not loaded by the bootstrap system.

Loading Priority

Files are searched at two levels, with the workspace taking priority:

Workspace (project root)

Project-specific configurations. Takes priority over global. The project root is detected automatically (see below).

Global (~/.chatcli/)

User default configurations. Serves as a fallback when the file does not exist in the workspace.

If the same file exists at both levels, the workspace version prevails. Global files serve as fallback.
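The precedence rule can be sketched in a few lines of Python. This is an illustration only, not ChatCLI's implementation: `resolve_bootstrap_file` and its parameters are hypothetical names, and the real loader may behave differently in edge cases.

```python
from pathlib import Path
from typing import Optional

# The five recognized names, in load order (taken from the table above).
BOOTSTRAP_FILES = ["AGENTS.md", "SOUL.md", "USER.md", "IDENTITY.md", "RULES.md"]


def resolve_bootstrap_file(name: str, workspace: Path, global_dir: Path) -> Optional[Path]:
    """Return the workspace copy if present, else the global copy, else None."""
    for base in (workspace, global_dir):  # workspace is checked first
        candidate = base / name
        if candidate.is_file():
            return candidate
    return None  # missing at both levels: the file is skipped
```

Calling it once per entry in `BOOTSTRAP_FILES` yields the set of files that would be loaded for a given workspace.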

Automatic Workspace Detection

ChatCLI uses detectProjectDir() to find the real project root. Instead of simply using the current directory (CWD), it walks up the directory tree looking for project markers:

Checks if the current directory contains .git/ or .agent/
If not found, moves to the parent directory and repeats
Continues until a marker is found or it reaches the filesystem root
If no marker is found, falls back to the CWD

This means you can run ChatCLI from any subdirectory of your project and bootstrap files at the root will be found normally.

Recognized markers: .git (Git repository) and .agent (explicit ChatCLI marker). Only one needs to exist to define the workspace root.
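The upward walk described above can be sketched as follows. This is a minimal Python illustration, not the actual `detectProjectDir()` source; the function name `detect_project_dir` and the use of `exists()` (which also accepts a `.git` file) are assumptions.

```python
from pathlib import Path

MARKERS = (".git", ".agent")


def detect_project_dir(start: Path) -> Path:
    """Walk up from `start` until a directory containing a marker is found."""
    start = start.resolve()
    for directory in (start, *start.parents):
        # Assumption: any entry named .git or .agent counts as a marker.
        if any((directory / marker).exists() for marker in MARKERS):
            return directory
    return start  # no marker up to the filesystem root: fall back to the start
```

The nearest marker wins, so a nested directory with its own `.agent` becomes the workspace root even when a parent directory contains `.git`.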

Example Scenarios

CWD at startup	Marker found	Detected workspace	Files loaded
`~/project/`	`~/project/.git`	`~/project/`	`~/project/SOUL.md`, etc.
`~/project/src/pkg/`	`~/project/.git`	`~/project/`	`~/project/SOUL.md` (walks up 2 levels)
`~/project/src/pkg/`	none	`~/project/src/pkg/`	Global only (`~/.chatcli/`)
`~/monorepo/services/api/`	`~/monorepo/.git`	`~/monorepo/`	`~/monorepo/SOUL.md`, etc.
`~/tmp/`	none	`~/tmp/`	Global only (`~/.chatcli/`)

Detailed Examples

Defines personality and tone. Place at ~/.chatcli/SOUL.md (global) or ./SOUL.md (project):

# Personality

You are a technical assistant specialized in software engineering.
Be concise and direct. Prefer practical examples over theoretical explanations.
When suggesting code, use best practices and tests.

# Tone

- Professional but approachable
- Prefer short and objective responses
- Use bullet points for lists
- Default language: English

Defines user and project context. Ideal for ./USER.md in the project directory:

# Project Context

- Stack: Go 1.25, gRPC, Kubernetes
- Database: PostgreSQL 16
- CI/CD: GitHub Actions
- Style: conventional commits, trunk-based development

# Preferences

- Always use tables for comparisons
- Prefer simple solutions without over-engineering
- Tests with idiomatic Go table-driven tests

Defines what the assistant is. Usually global at ~/.chatcli/IDENTITY.md:

# Identity

You are ChatCLI, an intelligent terminal assistant.

## Capabilities

- Code reading and editing via @coder plugin
- Shell command execution with user approval
- Log analysis and error diagnosis
- Git operations (status, diff, log, commit)

## Limitations

- You do NOT have internet access
- You CANNOT install packages without approval
- Your patches may fail if context has changed

Defines strict rules and guardrails (RULES.md). Can be global or per-project:

# Mandatory Rules

1. NEVER execute `rm -rf` without explicit confirmation
2. NEVER commit directly to the main branch
3. ALWAYS run tests after modifying code
4. ALWAYS use conventional commits (feat:, fix:, chore:)

# Security Restrictions

- Do not expose secrets, tokens or API keys in logs
- Do not modify files outside the project directory
- Do not execute commands with sudo

Defines sub-agents and their roles for the orchestrator (AGENTS.md):

# Custom Agents

## @devops
Infrastructure specialist for Docker, Kubernetes and CI/CD.
Use for deployment, monitoring and pipeline configuration tasks.

## @dba
PostgreSQL database specialist.
Use for queries, optimization, migrations and performance analysis.

## @security
Security auditor focused on OWASP Top 10.
Use for code review with focus on vulnerabilities.

Where to Place Files

“Project root” means the directory containing .git/ or .agent/ — automatically detected by detectProjectDir(), not necessarily the CWD.

# GLOBAL configuration (applies to all projects)
~/.chatcli/SOUL.md
~/.chatcli/IDENTITY.md
~/.chatcli/RULES.md
~/.chatcli/USER.md
~/.chatcli/AGENTS.md

# Per-PROJECT configuration (overrides global)
# "Project root" = directory with .git/ or .agent/
<project-root>/SOUL.md          # In project root
<project-root>/USER.md          # In project root
<project-root>/RULES.md         # In project root (project rules)
<project-root>/IDENTITY.md      # In project root (rare)
<project-root>/AGENTS.md        # In project root (project agents)

CHATCLI_BOOTSTRAP_DIR only overrides the global directory (~/.chatcli/), not the workspace detection. Project-level files (detected via .git or .agent) still take priority over global ones, regardless of CHATCLI_BOOTSTRAP_DIR.

Recommended strategy: SOUL.md and IDENTITY.md global (they’re about the assistant), USER.md and RULES.md per-project (they’re about the work context).

Smart Cache

Bootstrap files use mtime-based (modification time) caching:

On the first read, the content is cached in memory
Subsequent reads check if the mtime has changed
If the file was modified, the cache is automatically invalidated
IsStale() checks if any file has changed since the last load
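A minimal sketch of mtime-based caching, assuming a per-path cache keyed by the file's modification time. The class and method names are hypothetical and only mirror the behavior listed above; they are not ChatCLI's code.

```python
import os
from pathlib import Path
from typing import Dict, Tuple


class BootstrapCache:
    """Caches file contents per path and invalidates when the mtime changes."""

    def __init__(self) -> None:
        self._entries: Dict[Path, Tuple[int, str]] = {}

    def read(self, path: Path) -> str:
        mtime = os.stat(path).st_mtime_ns
        cached = self._entries.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]  # cache hit: the file has not changed
        content = path.read_text(encoding="utf-8")
        self._entries[path] = (mtime, content)  # first read or invalidated entry
        return content

    def is_stale(self) -> bool:
        """True if any cached file changed or disappeared since it was loaded."""
        for path, (mtime, _) in self._entries.items():
            try:
                if os.stat(path).st_mtime_ns != mtime:
                    return True
            except FileNotFoundError:
                return True
        return False
```

Comparing modification times costs one `stat` call per file, which is much cheaper than re-reading and re-hashing the content on every turn.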

Persistent Memory

The memory system maintains context across ChatCLI sessions using structured storage with multiple components that learn about you and your work over time.

System Architecture

Conversation -> memoryWorker (3min) -> LLM extraction -> ProcessExtraction()
                                                              |
                    +───────────+───────────+─────────────────+────────────+
                    v           v           v                 v            v
              FactIndex    Profile    TopicTracker    ProjectTracker  DailyNote
              (scored)     (JSON)     (JSON)          (JSON)          (.md)
                    |
                    v
              Compactor (6h check, 24h cycle)
              |-- Level 1: Score-based pruning + archive
              +-- Level 2: LLM consolidation
                    |
                    v
              MEMORY.md (regenerated, never source of truth)

Resilient extraction — nothing is lost silently

Extraction depends on a background LLM call — and a provider outage must not cost the conversation its memory. Three layered defenses:

Fallback chain: extraction tries the session’s active client and, on failure, walks CHATCLI_MEMORY_FALLBACK_PROVIDERS (or CHATCLI_FALLBACK_PROVIDERS), with a per-attempt timeout.
Durable on-disk queue: a segment that fails on every provider is written to ~/.chatcli/memory/pending/ (atomic writes) and retried on later runs, oldest first — it survives restarts. The queue is capped (100 segments) and corrupt files are dropped without wedging the rest.
A visible notice: two consecutive failures print a one-liner in the terminal (memory: extraction failing…) — days of silent fact loss can no longer happen.
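The first two defenses can be sketched as below. This is a simplified Python illustration under stated assumptions: providers are modeled as plain callables, the per-attempt timeout is omitted, and the JSON layout of a pending segment is invented for the example. All names are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Callable, Iterable, Optional

Extractor = Callable[[str], str]  # takes a conversation segment, returns extracted text


def extract_with_fallback(segment: str, providers: Iterable[Extractor]) -> Optional[str]:
    """Try each provider in order and return the first successful result."""
    for provider in providers:
        try:
            return provider(segment)
        except Exception:
            continue  # this provider failed: move on to the next one
    return None  # every provider failed


def enqueue_pending(segment: str, pending_dir: Path, name: str) -> Path:
    """Write a failed segment atomically: temp file first, then rename."""
    pending_dir.mkdir(parents=True, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=pending_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump({"segment": segment}, handle)  # assumed file layout
    final_path = pending_dir / f"{name}.json"
    os.replace(tmp_path, final_path)  # atomic when both paths share a filesystem
    return final_path
```

Writing to a temporary file and renaming it means a crash mid-write leaves either no queue entry or a complete one, never a truncated file.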

The gateway consults this memory too: the daemon’s persona calls @memory recall before answering “I don’t know” to personal questions. See Chat Gateway.

Storage Structure

All memory lives in ~/.chatcli/memory/:

~/.chatcli/memory/
|-- MEMORY.md              # Human-readable summary (regenerated from FactIndex)
|-- memory_index.json      # Facts with relevance scores
|-- user_profile.json      # User profile (name, role, expertise)
|-- topics.json            # Recurring topics with frequency
|-- projects.json          # Active projects with context
|-- usage_stats.json       # Usage patterns and statistics
|-- memory_archive.json    # Archived facts (low score)
|-- weekly/                # Weekly digests (consolidated from daily notes)
|   +-- 2026-W27.md
|-- monthly/               # Monthly digests (kept forever)
|   +-- 2026-06.md
|-- 202603/                # Daily notes for March 2026
|   |-- 20260301.md
|   +-- 20260306.md
+-- 202602/
    +-- 20260228.md

Components

FactIndex -- Long-Term Memory

Replaces the old append-only MEMORY.md. Each fact has:

Unique ID via SHA-256 content hash (automatic deduplication)
Category: architecture, pattern, preference, gotcha, project, personal
Temporal score: (1 + log(1 + accessCount)) * exp(-daysSinceAccess * ln(2) / halfLifeDays)
Tags for keyword search

Frequently accessed and recent facts get higher scores. Old, never-accessed facts naturally decay.
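Deduplication by content hash can be sketched like this. It is an illustrative Python example, not the FactIndex code: the whitespace and case normalization applied before hashing is an assumption, as are the function names.

```python
import hashlib
from typing import Dict


def fact_id(content: str) -> str:
    """Derive a stable ID from the fact text (the normalization is an assumption)."""
    normalized = " ".join(content.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def add_fact(index: Dict[str, str], content: str) -> bool:
    """Insert the fact unless its hash already exists; return True if it was added."""
    key = fact_id(content)
    if key in index:
        return False  # same content hash: treated as a duplicate
    index[key] = content
    return True
```

Because the ID is derived from the content itself, re-extracting the same fact in a later session maps to the same key and is discarded without any lookup by text.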

UserProfile -- User Profile

Automatically detected by the AI during extraction and editable by you/the model in any mode:

Name, role, expertise level, company, location
Preferred language and communication style
Lists with lifecycle: certifications, skills, goals, interests and directives (restating an item supersedes the old one instead of duplicating; _replace/_done/_remove suffixes rewrite)
Directives with severity (hard rules vs preferences) and per-project scope ([scope:<project>] rule is only injected when that workspace is active)
Stances — technical positions recorded with the why ("position :: reason")
Milestones — dated timeline of what happened
Structured Environment (env_os, env_shell, …) — auto-migrated from legacy preferences
Per-field provenance and freshness: each attribute knows whether it came from you or from extraction and when it was last confirmed; fields unconfirmed for 120+ days are flagged as possibly stale in the prompt
Privacy tier: finance/health/family/document keys are auto-tagged [sensitive] — they personalize answers but never enter code, examples or generated artifacts (sensitive_mark/sensitive_unmark for manual control)
Most used commands (top 10) and general preferences

View your profile with /memory profile. Profiles polluted by older append-only versions self-heal on the next load: fragments, progress duplicates and completed goals are removed automatically.

TopicTracker -- Recurring Topics

Tracks technical topics discussed:

Mention frequency
Recency (recent topics weigh more)
Links to related facts

View with /memory topics.

ProjectTracker -- Projects

Tracks projects you work on:

Name, path, description
Technologies used
Status (active, paused, completed)
Last activity

View with /memory projects.

PatternDetector -- Usage Patterns

Analyzes how you use ChatCLI:

Total sessions and average duration
Peak activity hours
Preferred features (chat, agent, coder)
Common errors and resolutions

View with /memory stats.

Smart Retrieval

Instead of dumping all memory into the system prompt, ChatCLI uses intelligent retrieval:

Extracts keywords from the last few conversation messages
Searches relevant facts in FactIndex by keyword match + temporal score
Respects a configurable budget (default: 4000 characters)
Prioritizes: Profile > Projects > Topics > Relevant facts > Recent notes > Trajectory (digests) > Usage patterns

Facts accessed by the retriever automatically get their scores bumped, creating a virtuous cycle: the more useful a fact is, the more it appears.
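The keyword-plus-score ranking and the character budget can be sketched as follows. This Python example is a simplification under stated assumptions: keyword extraction is a plain word split, relevance is keyword overlap multiplied by the temporal score, and only facts are considered (the profile, project and topic tiers are left out). All names are hypothetical.

```python
import re
from typing import List, Set, Tuple

Fact = Tuple[str, float]  # (text, temporal score)


def keywords(messages: List[str]) -> Set[str]:
    """Lowercased alphanumeric words of three or more characters."""
    return {w for m in messages for w in re.findall(r"[a-z0-9]{3,}", m.lower())}


def retrieve(facts: List[Fact], messages: List[str], budget: int = 4000) -> List[str]:
    """Pick matching facts, best first, until the character budget is used up."""
    terms = keywords(messages)
    ranked = []
    for text, score in facts:
        overlap = len(terms & keywords([text]))
        if overlap:
            ranked.append((overlap * score, text))  # assumed ranking formula
    ranked.sort(key=lambda item: item[0], reverse=True)
    selected: List[str] = []
    used = 0
    for _, text in ranked:
        if used + len(text) > budget:
            continue  # this fact would exceed the budget: skip it
        selected.append(text)
        used += len(text)
    return selected
```

Facts with no keyword overlap never enter the prompt, which is what keeps the injected block bounded regardless of how large the store grows.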

Injection mode: push vs pull

How memory reaches the model in /agent and /coder is controlled by CHATCLI_MEMORY_MODE:

Mode	Behavior
`index` (default)	Injects only a compact, stable index (profile summary + top topic/project names + fact tally by category + episode count) and lets the agent pull detail on demand via `@memory recall`. Bounded, cacheable per-turn cost.
`full`	Injects the full Smart Retrieval (above) every turn — the classic “push” behavior.
`off`	Injects no memory; bootstrap files still apply.

Proactive auto-recall closes the pull model’s gap: models sometimes skip the @memory recall call even when a stored gotcha is exactly what they need. In index mode, each agent/coder turn also ranks the fact index against keywords from the recent messages and rides the top 3 matches into the prompt as a tiny [MEMORY AUTO-RECALL] block (700-byte cap). The block lives in the uncached trailing region next to the wall-clock context, so the stable digest stays byte-identical and prompt caching is unaffected. Gated by CHATCLI_MEMORY_AUTORECALL (on by default).

The pull mode (index) shrinks the per-turn memory block by ~88% on a 500-fact store without losing access to detail; see Token Efficiency › Pull-first memory for the full measurement. Chat is tool-less and cannot pull on demand, so there index degrades to full. The @memory recall tool uses HyDE + vector search, so pulled detail matches push quality. Check the active mode with /config memory.

Memory Configuration

The memory system has tunable parameters via environment variables:

Variable	Default	Description
`CHATCLI_MEMORY_MODE`	`index`	Injection mode in agent/coder: `index` (pull), `full` (push) or `off`
`CHATCLI_MEMORY_AUTORECALL`	`true`	Proactive top-3 fact injection in `index` mode (see above)
`CHATCLI_MEMORY_MAX_SIZE`	`32768` (32KB)	Max size of rendered MEMORY.md
`CHATCLI_MEMORY_RETENTION_DAYS`	`30`	Daily note retention before cleanup
`CHATCLI_MEMORY_MAX_FACTS`	`500`	Max facts in the FactIndex
`CHATCLI_MEMORY_RETRIEVAL_BUDGET`	`4000`	Max chars of memory injected in system prompt (`full` mode)

In addition to environment variables, the internal Config struct defines additional defaults:

Parameter	Default Value	Description
`CompactionInterval`	24 hours	Minimum interval between full compactions
`DecayHalfLifeDays`	30.0	Temporal decay half-life for fact scores
Check interval	6 hours	How often the system checks if compaction is needed

How Memories Are Created

The background worker now extracts 6 types of information (previously only 2):

DAILY — What was done (files, commands, errors, tasks)
LONGTERM — New facts to remember permanently
PROFILE_UPDATE — Information about the user (name, role, expertise)
TOPICS — Technical topics discussed
PROJECTS — Projects worked on
EPISODES — Dated, durable units of completed work (see Episodic memory)

The worker fires after 4+ new messages with a 2-minute cooldown, and also every 3 minutes during long sessions.
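One possible reading of these trigger rules, sketched in Python. The exact interaction between the message threshold, the cooldown and the periodic tick is not specified here, so the logic below is an assumption, as are the names.

```python
MIN_NEW_MESSAGES = 4      # "4+ new messages"
COOLDOWN_SECONDS = 120    # 2-minute cooldown
PERIODIC_SECONDS = 180    # periodic run every 3 minutes


def should_extract(new_messages: int, seconds_since_last_run: float) -> bool:
    """Decide whether the background worker should run an extraction now."""
    if new_messages <= 0:
        return False  # assumption: nothing new means nothing to extract
    if new_messages >= MIN_NEW_MESSAGES and seconds_since_last_run >= COOLDOWN_SECONDS:
        return True   # enough new messages and the cooldown has elapsed
    return seconds_since_last_run >= PERIODIC_SECONDS  # periodic tick in long sessions
```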

Extraction Process

The Memory Worker follows this internal flow:

EnhancedExtractionPrompt: Sends the recent conversation history to the LLM with a structured prompt requesting information extraction
Expected output: The LLM returns text with well-defined section headers:
- ## DAILY: Summary of what was done in the session
- ## LONGTERM: New facts for long-term memory
- ## PROFILE_UPDATE: User profile updates
- ## TOPICS: Technical topics identified
- ## PROJECTS: Projects mentioned or worked on
- ## EPISODES: Dated, durable units of completed work
ParseEnhancedResponse(): Parses the response and extracts each section individually
Deduplication: Each fact receives a unique ID via SHA-256 content hash. Facts with a hash matching an existing one are automatically discarded
Profile merging with lifecycle: PROFILE_UPDATE changes are merged with the existing profile. List fields upsert (restating an item supersedes the old one instead of duplicating), and extraction may emit rewrite operations: goals_done=/goals_remove= remove finished goals (moving them to milestone=/certifications=), goals_replace= replaces the whole list — the same suffixes work for certifications, skills, interests and directives. Your instructions about the profile (“remove X from my goals”) are applied, never recorded as if they were facts
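Splitting a response into its sections can be sketched as below. This Python example is not the `ParseEnhancedResponse()` implementation; it assumes each section starts with a line of the form `## NAME` in upper case, and the function name is hypothetical.

```python
import re
from typing import Dict

# A header is a line such as "## DAILY" or "## PROFILE_UPDATE".
SECTION_HEADER = re.compile(r"^##[ \t]+([A-Z_]+)[ \t]*$", re.MULTILINE)


def parse_sections(response: str) -> Dict[str, str]:
    """Split an extraction response into {SECTION_NAME: body} pairs."""
    sections: Dict[str, str] = {}
    matches = list(SECTION_HEADER.finditer(response))
    for i, match in enumerate(matches):
        # A section body runs until the next header or the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(response)
        sections[match.group(1)] = response[match.end():end].strip()
    return sections
```

Text before the first header is ignored, and a header with nothing under it yields an empty string, so downstream steps can skip empty sections.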

Automatic Compaction

The system runs periodic compaction to prevent uncontrolled growth:

Check (every 6 hours)

Checks if the fact count exceeds 80% of the limit or if 24h have passed since the last compaction.

LLM Compaction (preferred)

Sends all facts to the AI with instructions to merge duplicates, remove obsolete facts and consolidate related ones. Preserves original metadata.

Score-based Fallback

If the LLM call fails, archives facts with scores below 0.1 to memory_archive.json.

Daily Note Cleanup

Removes notes older than the retention period (default: 30 days). Empty directories are cleaned up.

MEMORY.md Regeneration

Rewrites MEMORY.md from the FactIndex — always up-to-date, never the source of truth.
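The compaction check and the score-based fallback described above can be sketched like this. It is a hedged Python illustration with hypothetical names; facts are reduced to an ID-to-score mapping for brevity.

```python
from typing import Dict, Tuple


def should_compact(fact_count: int, max_facts: int, hours_since_last: float) -> bool:
    """Compaction is due above 80% of the fact limit or after 24 hours."""
    over_threshold = fact_count * 5 > max_facts * 4  # integer form of count > 0.8 * max
    return over_threshold or hours_since_last >= 24


def prune_by_score(
    scores: Dict[str, float], threshold: float = 0.1
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Split facts into (kept, archived); archived facts score below the threshold."""
    kept = {key: score for key, score in scores.items() if score >= threshold}
    archived = {key: score for key, score in scores.items() if score < threshold}
    return kept, archived
```

With the default limit of 500 facts, the count-based trigger fires from the 401st fact onward.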

Weekly and Monthly Digests (Trajectory)

Daily notes expire after ~30 days — rollups preserve the long-range narrative before that:

When an ISO week ends, that week’s daily notes consolidate into weekly/2026-W27.md (LLM summary with a deterministic fallback — never depends on the provider being up).
When a month ends, its weekly digests consolidate into monthly/2026-06.md.
Retention: weeklies keep ~26 weeks; monthlies are kept forever (minimal cost).
A bounded Trajectory section enters the memory context with the latest monthly + recent weeklies — that is how “what you have been doing these past months” survives daily-note cleanup.

The process is idempotent and runs on its own in the memory worker (startup check plus every 12h).
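The file names used by the rollups follow directly from the date. The sketch below derives them in Python; it assumes the weekly name uses the ISO year together with the ISO week number, and the function names are hypothetical.

```python
from datetime import date


def weekly_digest_name(day: date) -> str:
    """ISO-week file name, e.g. 2026-W27.md."""
    iso_year, iso_week, _ = day.isocalendar()
    return f"{iso_year}-W{iso_week:02d}.md"


def monthly_digest_name(day: date) -> str:
    """Calendar-month file name, e.g. 2026-06.md."""
    return f"{day.year}-{day.month:02d}.md"


def daily_note_path(day: date) -> str:
    """Relative path of a daily note, e.g. 202603/20260306.md."""
    return f"{day:%Y%m}/{day:%Y%m%d}.md"
```

Using the ISO year matters at year boundaries: 29 December 2025 belongs to ISO week 1 of 2026, so its notes would roll up into 2026-W01.md under this assumption.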

Automatic Migration

When starting for the first time with the new system, ChatCLI detects if a legacy MEMORY.md exists (without memory_index.json) and migrates automatically:

Each line/bullet is converted to an individual fact
Categories are detected from markdown headers
Tags are extracted by technical keywords
The original file is saved as MEMORY.md.bak

`/memory` Command

Subcommand	Description
`/memory` or `/memory today`	Show today’s notes
`/memory yesterday`	Show yesterday’s notes
`/memory 2026-03-04`	Show notes from a specific date
`/memory week`	Show notes from the last 7 days
`/memory longterm`	Show MEMORY.md content
`/memory list`	List all memory files (includes structured JSONs)
`/memory load <date>`	Load a day’s notes into conversation context
`/memory profile`	Show detected user profile
`/memory profile set <field>=<value>`	Set/update a profile field manually
`/memory remember <fact>`	Explicitly add a long-term fact (accepts a `[category]` prefix)
`/memory forget <substring>`	Remove long-term facts containing the substring
`/memory topics`	Show tracked topics with frequency
`/memory projects`	Show tracked projects with status
`/memory stats`	Full statistics (sessions, peak hours, errors, features)
`/memory facts [category]`	List facts with scores (filter by category)
`/memory timeline [q]`	Dated episode history; `q` filters by time (“3 months ago”, “abril”, “2026-04”) and/or content
`/memory compact`	Force immediate compaction (LLM + note cleanup)

Manual editing and extended profile

Automatic detection doesn’t always catch everything, so you can edit memory explicitly. Beyond name/role/expertise/company/location, the profile covers certifications, skills, goals, interests, directives, stances, milestones and environment (env_*). List fields have a lifecycle rather than being append-only: new items are added, a restated item (same text apart from status/parenthetical) supersedes the old one in place, and operation suffixes rewrite the list. _replace overwrites the whole list (an empty value clears it), while _done and _remove delete matching items. Commas inside parentheses are safe ("Quiz X (Provider, 60 questions)" stays whole).

> /memory profile set company=ACME Corp
> /memory profile set certifications=CKA, AWS SAA        # upsert, deduplicated
> /memory profile set goals_done=earn the CKA            # finished goal leaves the list
> /memory profile set milestone=Earned the CKA           # ...and becomes a dated milestone
> /memory profile set goals_replace=launch product Y     # replaces ALL goals
> /memory profile set stance=prefer keyless backends :: less setup friction
> /memory profile set directives=[scope:my-repo] always run the linter before pushing
> /memory profile set env_shell=zsh
> /memory profile set sensitive_mark=monthly_income      # marks as private
> /memory remember [preference] Prefers Go over Python for CLIs
> /memory forget Python                                  # removes FACTS containing "Python"

forget only acts on facts; to remove profile items use the suffixes (goals_remove=...). Directives with [scope:<project>] are only injected when that workspace is active; without the tag they apply globally.
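The list lifecycle can be sketched in Python as follows. This is a simplified model with hypothetical names: items match on exact text (case-insensitive), whereas the behavior described above also treats items that differ only in status or parenthetical as the same item.

```python
from typing import List


def split_items(value: str) -> List[str]:
    """Split on commas that are not inside parentheses."""
    items: List[str] = []
    current: List[str] = []
    depth = 0
    for ch in value:
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        if ch == "," and depth == 0:
            items.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    items.append("".join(current).strip())
    return [item for item in items if item]


def apply_list_op(existing: List[str], key: str, value: str) -> List[str]:
    """Apply `field=`, `field_replace=`, `field_done=` or `field_remove=`."""
    incoming = split_items(value)
    if key.endswith("_replace"):
        return incoming  # an empty value clears the list
    if key.endswith(("_done", "_remove")):
        drop = {item.lower() for item in incoming}
        return [item for item in existing if item.lower() not in drop]
    result = list(existing)
    seen = {item.lower() for item in result}
    for item in incoming:  # plain key: add new items, skip duplicates
        if item.lower() not in seen:
            result.append(item)
            seen.add(item.lower())
    return result
```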

`@memory` tool (agent, coder and chat)

Inside /agent and /coder, the model can persist and explore memory on its own via the @memory tool (cmds remember, profile, forget, recall, timeline, neighbors, map):

<tool_call name="@memory" args='{"cmd":"remember","args":{"content":"User earned the AWS Solutions Architect certification","category":"personal"}}' />

So when you tell the agent something new (e.g. a fresh certification), it records it into your profile/long-term facts without you running /memory manually. The profile subcommand accepts every lifecycle operation (goals_done, goals_replace, stance, milestone, env_*, sensitive_mark, …). In chat too: memory/profile updates are the fourth sanctioned exception of tool-less chat (alongside ask_user, read-only knowledge and @graphview). When you reveal or correct a durable fact about yourself, the model persists it in the same turn — no more “I’ll keep that in mind” without writing anything. Gated by CHATCLI_CHAT_MEMORY (on by default), flippable with /config chat memory on|off.

Episodic memory: the work timeline

Facts capture what is true; episodes capture what happened. An episode is a dated, durable unit of completed work — summary, outcome and refs (files, PRs, commands) — extracted automatically by the background worker (## EPISODES section) and stored forever in ~/.chatcli/memory/episodes.json. Unlike daily notes (30-day retention) and their digests, episodes never expire, so “what did we do three months ago?” finally has a structured answer.

Storage follows the fact-index disciplines: atomic writes, quarantine on corruption, and reconciling multi-process rewrites (REPL, gateway daemon and MCP server never erase each other). The same work item re-extracted on the same day dedups and enriches outcome/refs instead of duplicating. Capped at 2000 episodes (oldest dropped first).
@memory timeline is the model-facing chronological view: {project?, from?, to?, query?, limit?}. from/to accept ISO dates (2026-04, 2026-04-12) or natural expressions in English and Portuguese (“3 months ago”, “há 3 meses”, “abril”, “last week”) — a time expression inside query works too.
Temporal recall: a @memory recall query carrying a time expression (“o que fizemos em abril?”, Portuguese for “what did we do in April?”) automatically prepends the matching timeline slice to the result, because relevance ranking alone cannot answer “when”. When the window came from natural language and the leftover words match nothing, the window wins and the slice is returned unfiltered (the date was the real signal).
/memory timeline [q] is the human view — same temporal parsing, rendered in the terminal.
Rollups are grounded in episodes: weekly/monthly digests now lead with the period’s episodes and use daily notes as color, so the long-range narrative stops decaying — and a week whose notes already expired still gets its digest from episodes alone.

<tool_call name="@memory" args='{"cmd":"timeline","args":{"query":"3 months ago"}}' />
<tool_call name="@memory" args='{"cmd":"timeline","args":{"project":"chatcli","from":"2026-04","to":"2026-06"}}' />

Knowledge graph (`@memory neighbors` / `map`)

What ChatCLI knows about you is not a flat list: facts, topics, projects, skills and tags form an on-demand graph (Obsidian-style, in the core) derived from the relationships the stores already record (topic↔fact, fact→project, tags, skill triggers) plus [[wikilinks]] in note text. Access is through @memory itself:

recall → search by content (“which facts match these words?”).
neighbors <subject> → local graph: backlinks + related notes for a subject (“what connects to this?”).
map → overview (counts by kind + hubs).

Context discipline: only a tiny, deterministic index card rides each turn (cache-friendly); depth is pulled on demand. To visualize, use /graph (renders the graph to an image via embedded go-graphviz).
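The wikilink part of the graph can be sketched as below. This Python example covers only `[[wikilinks]]` in note text, not the topic, project, tag or skill relationships, and every name in it is hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Set

WIKILINK = re.compile(r"\[\[([^\[\]]+)\]\]")


def build_backlinks(notes: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each link target to the set of notes that mention it."""
    backlinks: Dict[str, Set[str]] = defaultdict(set)
    for name, text in notes.items():
        for target in WIKILINK.findall(text):
            backlinks[target.strip()].add(name)
    return dict(backlinks)


def neighbors(notes: Dict[str, str], subject: str) -> Set[str]:
    """Notes that link to `subject` plus the targets `subject` links to."""
    incoming = build_backlinks(notes).get(subject, set())
    outgoing = {t.strip() for t in WIKILINK.findall(notes.get(subject, ""))}
    return (incoming | outgoing) - {subject}
```

Because the graph is derived from the notes on each call, nothing extra needs to be stored or kept in sync, which matches the on-demand design described above.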

Fact quality (confidence, provenance, reconciliation)

Each fact carries confidence and provenance: what you state directly outranks a background-extraction guess, and a re-observed fact climbs — confidence scales the score (ranking and survival against decay/pruning). On write, ChatCLI reconciles: a rephrasing reinforces the existing fact instead of duplicating, and a same-subject update of equal-or-higher confidence supersedes the stale one (conservatively — a weak guess never wipes a strong fact). Topics gain a rolling summary of what was discussed, becoming real knowledge nodes. Legacy memory indexes are enriched once on startup, without losing anything.

What goes in each component?

FactIndex: Stable, long-lasting facts — decisions, patterns, gotchas, preferences
UserProfile: Who you are — name, role, expertise, language
TopicTracker: What you talk about — Go, Docker, K8s, etc.
ProjectTracker: What you work on — chatcli, my-app, etc.
PatternDetector: How you work — schedules, features, common errors
Daily notes: What happened today — temporal and specific

Can I edit memories manually?

Yes! All files are plain JSON or Markdown:

# View profile
jq . ~/.chatcli/memory/user_profile.json

# View the first five facts with scores (assumes the index is a top-level JSON array)
jq '.[0:5]' ~/.chatcli/memory/memory_index.json

# Edit today's note
vim ~/.chatcli/memory/$(date +%Y%m)/$(date +%Y%m%d).md

JSON changes are loaded on next startup. MEMORY.md is regenerated and should not be edited directly.

How does fact scoring work?

Each fact has a score calculated by:

score = (1 + log(1 + accessCount)) * exp(-daysSinceAccess * ln(2) / halfLifeDays)

accessCount: How many times the fact was used by the retriever
daysSinceAccess: Days since last access
halfLifeDays: Decay half-life (default: 30 days)

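Frequently and recently accessed facts get high scores. A never-accessed fact starts at 1.0 and halves with every half-life, so under this formula it drops below the 0.1 archive threshold after about 3.3 half-lives (roughly 100 days at the default).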
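The formula translates directly into code. The Python sketch below assumes `log` means the natural logarithm, which the text does not specify; the function name is hypothetical.

```python
import math


def fact_score(access_count: int, days_since_access: float, half_life_days: float = 30.0) -> float:
    """Frequency boost multiplied by exponential time decay."""
    frequency = 1 + math.log(1 + access_count)  # assumption: natural logarithm
    decay = math.exp(-days_since_access * math.log(2) / half_life_days)
    return frequency * decay
```

With the default half-life, a never-accessed fact scores 0.5 after 30 days and 0.125 after 90 days, while every retrieval both raises the frequency term and resets the decay clock.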

What Gets Injected into the Prompt

The ContextBuilder assembles the following block and injects it as a system prompt prefix:

## AGENTS.md

[content of AGENTS.md]

---

## SOUL.md

[content of SOUL.md]

---

## USER.md

[content of USER.md]

---

## IDENTITY.md

[content of IDENTITY.md]

---

## RULES.md

[content of RULES.md]

---

# Memory

## Long-term Memory

[content of MEMORY.md]

## Recent Daily Notes

### 2026-03-04

[content of March 4th note]

### 2026-03-05

[content of March 5th note]

### 2026-03-06

[content of today's note]

Empty sections (missing files) are automatically omitted — only what exists is injected.
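Assembling the prefix while omitting empty sections can be sketched as follows. This Python example is an approximation of the layout shown above, not the ContextBuilder itself: the function name is hypothetical and the memory block is passed in as pre-rendered text.

```python
from typing import Dict, List

BOOTSTRAP_ORDER = ["AGENTS.md", "SOUL.md", "USER.md", "IDENTITY.md", "RULES.md"]


def build_prefix(files: Dict[str, str], memory: str = "") -> str:
    """Join non-empty sections with '---' separators, in bootstrap order."""
    blocks: List[str] = []
    for name in BOOTSTRAP_ORDER:
        content = files.get(name, "").strip()
        if content:  # missing or blank files are omitted entirely
            blocks.append(f"## {name}\n\n{content}")
    if memory.strip():
        blocks.append(f"# Memory\n\n{memory.strip()}")
    return "\n\n---\n\n".join(blocks)
```

Iterating over a fixed order rather than over the input keeps the output deterministic, which also helps provider-side prompt caching.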

Configuration

Environment Variables

CHATCLI_BOOTSTRAP_ENABLED=true
CHATCLI_BOOTSTRAP_DIR=/path/to/bootstrap/files
CHATCLI_MEMORY_ENABLED=true

Variable	Default	Description
`CHATCLI_BOOTSTRAP_ENABLED`	`true`	Enable/disable bootstrap file loading
`CHATCLI_BOOTSTRAP_DIR`	`~/.chatcli/`	Alternative directory for global bootstrap files. Use this when you want to keep your files (SOUL.md, RULES.md, etc.) in another location, such as a versioned repository or a directory shared across machines
`CHATCLI_MEMORY_ENABLED`	`true`	Enable/disable the persistent memory system

CHATCLI_BOOTSTRAP_DIR only overrides the global directory (~/.chatcli/). Project-level files (detected via .git or .agent markers) still take priority over global ones.

Via Helm Chart

# values.yaml
bootstrap:
  enabled: true
  definitions:
    SOUL.md: |
      You are a DevOps assistant...
    USER.md: |
      The user prefers Go...

memory:
  enabled: true
  # Uses the persistence PVC by default

The Helm chart creates ConfigMaps for bootstrap files and mounts them at /home/chatcli/.chatcli/bootstrap/. Memory uses the sessions PVC for persistence.

Context Injection Optimization (Prompt Caching)

ChatCLI optimizes token costs when contexts are attached using three complementary strategies:

Unified System Prompt with Cache Hints

Contexts attached via /context attach are injected as system prompt, not as user messages. This enables provider-level prompt caching:

Provider	Mechanism	Discount
Anthropic	`cache_control: ephemeral`	~90%
OpenAI	Automatic prompt caching	~50%
Google	Context caching API	Variable

The system prompt block contains:

Bootstrap (SOUL.md, USER.md, etc.)
Memory (MEMORY.md + daily notes)
Attached Contexts (new — previously injected as user messages)
K8s Watcher (if active)

Because the stable part of the system prompt stays byte-identical across turns, the provider can cache it and bill the cached tokens at a discount.

Smart Compaction

Injected context messages (/memory load, summarized contexts) are automatically truncated during compaction (Level 1 — trimming). This prevents old reference context from consuming valuable token budget.

Token Visibility

The /context attached command now shows:

Estimated tokens per context
Total tokens per turn
Cache hints per provider
Warnings for oversized contexts

When running /context attach, the feedback includes the estimated cost per turn.

Best Practices

Global SOUL.md, per-project USER.md

Keep your preferred personality globally and technical context per project.

Keep MEMORY.md concise

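Keep only stable and confirmed facts in long-term memory, not session-specific ones. MEMORY.md is regenerated from the FactIndex, so prune with /memory forget or /memory compact rather than editing the file.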

Daily notes for journaling

Use them to record decisions, solved problems, and temporal context.

Don't duplicate CLAUDE.md

If you already use CLAUDE.md or project instructions, avoid duplicating them in bootstrap.

Periodically review your memories and remove outdated ones to keep the context relevant.

Next Steps

Conversation Control

Sessions