HyDE is opt-in (`CHATCLI_QUALITY_HYDE_ENABLED=true`), so the steady state carries no additional cost. Phase 3a costs one extra cheap LLM call; Phase 3b requires configuring an embedding provider.

## The problem HyDE solves
Before HyDE, `memory.Fact` retrieval was keyword-only: the scorer matches tokens extracted from recent messages against the tags and content of stored facts. This works well when the vocabulary matches exactly, but fails when the user uses synonyms or asks abstract questions.
Gap example (keyword-only, without HyDE):

- **User:** `how to do X in Go?`
- **Extracted keywords:** `[do, go]`
- **Stored fact:** `"use goroutines for concurrency in X pipelines"`
- **Match:** ❌ — “do” and “go” don’t literally appear in the fact.

Phases 3a and 3b below close this gap.
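The miss can be reproduced with a minimal token-overlap scorer. This is a sketch: `scoreFact`, `tokenSet`, and the tokenization rules are illustrative, not chatcli's real scorer.

```go
package main

import (
	"fmt"
	"strings"
)

// tokenSet lower-cases text and splits it into a set of word tokens.
func tokenSet(text string) map[string]bool {
	set := make(map[string]bool)
	for _, tok := range strings.Fields(strings.ToLower(text)) {
		set[strings.Trim(tok, ".,?!\"'")] = true
	}
	return set
}

// scoreFact counts how many query keywords literally appear in the
// fact's content (an illustrative stand-in for the real scorer).
func scoreFact(fact string, keywords []string) int {
	tokens := tokenSet(fact)
	score := 0
	for _, kw := range keywords {
		if tokens[strings.ToLower(kw)] {
			score++
		}
	}
	return score
}

func main() {
	fact := "use goroutines for concurrency in X pipelines"

	// Keyword-only: tokens extracted from "how to do X in Go?".
	fmt.Println(scoreFact(fact, []string{"do", "go"})) // prints 0: the fact is missed

	// With hypothesis-expanded keywords (phase 3a), domain nouns appear.
	fmt.Println(scoreFact(fact, []string{"do", "go", "goroutines", "concurrency"})) // prints 2
}
```

The same fact goes from zero overlap to a positive score once the hypothesis contributes domain nouns.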
## Phase 3a — Hypothesis-based keyword expansion

1. **LLM generates a short hypothesis.** Prompt: “Write a 2-4 sentence plausible answer that uses the technical nouns that would appear in any matching note. Bilingual if the query mixes languages.”
2. **`ExtractKeywords` from the hypothesis.** The same extractor already used in chat mode (en+pt stop words, min 3 chars).
3. **Merge, dedupe, lower-case.** Original keywords plus the top-N from the hypothesis; the cap is configurable via `CHATCLI_QUALITY_HYDE_NUM_KEYWORDS` (default 5).

## Phase 3b — Vector embeddings
Adds cosine-similarity search over fact embeddings.

## Architecture
### Supported providers

- **Voyage** (recommended): `voyage-3` (1024-dim) is the general-purpose sweet spot.
- **OpenAI**
- **Null** (default)

### Pure-Go vector store
No CGO, no SQLite-vec, no external deps. Just `[]float32` + cosine + JSON persistence in `~/.chatcli/memory/vector_index.json`. For N < 1000 facts (the typical chatcli case), linear in-memory search completes in microseconds, so there is no need for HNSW or IVFFlat indexing.
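At this scale the whole store reduces to a slice and a loop. Below is a minimal sketch of linear cosine search (function and variable names are illustrative, not chatcli's actual code), including the dimension check that motivates the lock described next:

```go
package main

import (
	"fmt"
	"math"
)

// cosine returns the cosine similarity of two equal-length vectors,
// or an error when dimensions differ (cosine across different
// dimensions is mathematically undefined).
func cosine(a, b []float32) (float64, error) {
	if len(a) != len(b) {
		return 0, fmt.Errorf("dimension mismatch: %d vs %d", len(a), len(b))
	}
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0, nil
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb)), nil
}

// bestMatch is the linear scan: for N < 1000 vectors this completes
// in microseconds, so no ANN index (HNSW/IVFFlat) is needed.
func bestMatch(query []float32, vectors [][]float32) (best int, bestSim float64) {
	best = -1
	for i, v := range vectors {
		sim, err := cosine(query, v)
		if err != nil {
			continue // skip vectors stored under a different dimension
		}
		if best == -1 || sim > bestSim {
			best, bestSim = i, sim
		}
	}
	return best, bestSim
}

func main() {
	store := [][]float32{{1, 0, 0}, {0.9, 0.1, 0}, {0, 1, 0}}
	i, sim := bestMatch([]float32{1, 0.05, 0}, store)
	fmt.Printf("best=%d sim=%.3f\n", i, sim)
}
```

The error path in `cosine` is exactly why a provider switch cannot be silent: mismatched vectors are unusable, not merely inaccurate.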
### Dimension lock

Switching providers (Voyage 1024 → OpenAI 1536) is not automatic: the store rejects the mismatched vectors with an explanatory error, because cosine similarity between vectors of different dimensions is mathematically invalid.

### Lazy backfill
When a fact is retrieved and has no vector (the fact predates embeddings activation), the index spawns a detached goroutine to embed the top-25 visible facts.

## Full configuration
| Env var | Default | Effect |
|---|---|---|
| `CHATCLI_QUALITY_HYDE_ENABLED` | `false` | Master switch (phase 3a) |
| `CHATCLI_QUALITY_HYDE_USE_VECTORS` | `false` | Enable phase 3b (requires provider) |
| `CHATCLI_QUALITY_HYDE_PROVIDER` | — | Display-only provider name |
| `CHATCLI_QUALITY_HYDE_NUM_KEYWORDS` | `5` | Hypothesis keyword cap in phase 3a |
| `CHATCLI_EMBED_PROVIDER` | — | One of `voyage`, `openai`, `null` |
| `CHATCLI_EMBED_MODEL` | provider default | E.g. `voyage-3`, `text-embedding-3-small` |
| `CHATCLI_EMBED_DIMENSIONS` | provider default | OpenAI only |
`/config quality` surfaces the current state.
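Putting the table together, a typical full 3a+3b setup might look like the following (values are illustrative; provider API-key variables are whatever your provider setup already requires and are not shown):

```shell
# Phase 3a: hypothesis-based keyword expansion (+1 cheap LLM call per turn)
export CHATCLI_QUALITY_HYDE_ENABLED=true
export CHATCLI_QUALITY_HYDE_NUM_KEYWORDS=5

# Phase 3b: vector search over fact embeddings
export CHATCLI_QUALITY_HYDE_USE_VECTORS=true
export CHATCLI_EMBED_PROVIDER=voyage
export CHATCLI_EMBED_MODEL=voyage-3   # 1024-dim, the recommended general-purpose model
```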
## Integration with Reflexion

HyDE amplifies Reflexion’s value: lessons persisted by #3 are retrieved with much higher recall when the next task doesn’t use the exact same keywords. Workflow:

1. **Turn 1:** an `auth.go` refactor fails (timeout).
2. Reflexion persists the lesson `"use Edit tool for large files"` with tags `[go, refactor, edit-tool]`.
3. **Turn 5 (days later):** `help me split pkg/engine`. The query contains neither `refactor` nor `edit`, so keyword-only retrieval would miss the lesson.
4. HyDE 3a generates the hypothesis `"To split a Go package, identify logical groupings and use refactor patterns with Edit tool for surgical changes..."`
5. Extracted keywords: `[split, package, refactor, edit, patterns, …]`, which now overlap the lesson’s tags.

## Caveats and tuning
Graceful fallback: if the LLM call fails or the embedding provider returns an error, retrieval silently falls back to keyword-only. No turn is ever aborted by a HyDE failure.
## See also

- **#3 Reflexion**: the lessons that HyDE retrieves with higher recall.
- **Bootstrap Memory**: the layer underneath, covering how `memory.Fact` is populated and maintained.
- **Persistent Context**: `/context attach` for explicit file contexts.
- **Full configuration**: all env vars and slash commands.