ChatCLI includes two native web tools — @webfetch and @websearch — that allow the agent to search for information on the internet and fetch web pages without depending on external MCP servers.
Web tools are native ChatCLI tools, automatically available in agent and coder modes. No MCP server configuration is required to use them.

@webfetch

Fetches a URL, strips the HTML, and returns the clean text content. Ideal for reading documentation, articles, READMEs, and any web page.

How It Works

1. HTTP request
   ChatCLI makes a GET request to the provided URL with standard browser headers.

2. HTML parsing
   The received HTML is parsed using golang.org/x/net/html, extracting only the text content.

3. Cleanup
   Script, style, and navigation tags and other non-text elements are removed. The resulting text is cleaned and formatted.

4. Return
   The text content is returned to the agent as the tool call result.

Usage

The LLM invokes @webfetch automatically when it needs to access content from a URL. You can also request it explicitly:
Read the documentation at https://pkg.go.dev/net/http and explain the Client type to me

Argument Formats

{
  "tool": "webfetch",
  "args": {
    "url": "https://pkg.go.dev/net/http"
  }
}
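
As a rough illustration of how a host program might decode the envelope above, here is a minimal Go sketch. The type and function names (toolCall, webfetchArgs, parseWebfetchCall) are hypothetical and only the url field is handled; ChatCLI's internal types may look different.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// toolCall mirrors the JSON envelope shown above.
// Hypothetical type, not ChatCLI's internal representation.
type toolCall struct {
	Tool string          `json:"tool"`
	Args json.RawMessage `json:"args"`
}

// webfetchArgs covers only the url parameter for brevity.
type webfetchArgs struct {
	URL string `json:"url"`
}

// parseWebfetchCall decodes the envelope, checks the tool name,
// then decodes the tool-specific arguments.
func parseWebfetchCall(raw string) (webfetchArgs, error) {
	var call toolCall
	if err := json.Unmarshal([]byte(raw), &call); err != nil {
		return webfetchArgs{}, fmt.Errorf("decode envelope: %w", err)
	}
	if call.Tool != "webfetch" {
		return webfetchArgs{}, fmt.Errorf("unexpected tool %q", call.Tool)
	}
	var args webfetchArgs
	if err := json.Unmarshal(call.Args, &args); err != nil {
		return webfetchArgs{}, fmt.Errorf("decode args: %w", err)
	}
	if args.URL == "" {
		return webfetchArgs{}, errors.New("missing url")
	}
	return args, nil
}

func main() {
	args, err := parseWebfetchCall(`{"tool":"webfetch","args":{"url":"https://pkg.go.dev/net/http"}}`)
	fmt.Println(args.URL, err)
}
```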

Example

User: Fetch the content from https://go.dev/blog/error-handling-and-go

Agent: I'll fetch the page content.

[tool_call: webfetch {"url": "https://go.dev/blog/error-handling-and-go"}]

Result: The article "Error handling and Go" explains Go's error
handling patterns, including...
@webfetch follows redirects (up to 10), enforces a 30-second timeout, and returns clear errors for inaccessible URLs or SSL issues.
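
The redirect and timeout limits can be approximated with Go's standard net/http client. This is a sketch under the assumption that the limits are applied at the client level; the helper names (checkRedirect, newFetchClient) are illustrative and not taken from ChatCLI's source.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

const (
	maxRedirects = 10               // redirect limit quoted in the text
	fetchTimeout = 30 * time.Second // timeout quoted in the text
)

// checkRedirect rejects a redirect once `hops` earlier requests
// have already been made for this fetch.
func checkRedirect(hops int) error {
	if hops >= maxRedirects {
		return fmt.Errorf("stopped after %d redirects", maxRedirects)
	}
	return nil
}

// newFetchClient returns an http.Client with both limits applied.
func newFetchClient() *http.Client {
	return &http.Client{
		Timeout: fetchTimeout,
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return checkRedirect(len(via))
		},
	}
}

func main() {
	c := newFetchClient()
	fmt.Println(c.Timeout, checkRedirect(3), checkRedirect(10))
}
```

http.Client.Timeout covers the whole exchange, including redirects and reading the body, which is why a single value is enough here.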

Filters for large payloads

Endpoints like Prometheus /metrics, configuration dumps or long listings can easily exceed tens of thousands of characters. @webfetch accepts a set of parameters that perform line-level filtering before truncation, so the useful part is not discarded:
| Parameter | Type | Description |
| --- | --- | --- |
| filter | string (Go regex) | Keep only lines matching the regex. Applied before exclude and from_line/to_line. |
| exclude | string (Go regex) | Drop lines matching the regex. Applied after filter. |
| from_line | integer | Start of the window in the filtered view (1-based, inclusive). |
| to_line | integer | End of the window in the filtered view (1-based, inclusive). |
| save_to_file | boolean | Persist the full pre-filter body to the session scratch dir and return preview + absolute path. Triggered automatically when the body exceeds CHATCLI_WEBFETCH_AUTOSAVE_BYTES (default 10000) AND no filter/range is set. |
| save_path | string | Override the generated filename for save_to_file (any directory prefix is discarded; writes are always confined to the scratch dir). |
| max_length | integer | Maximum inline content length (default: 20,000). Content above this is truncated inline, or saved to the scratch dir when auto-save applies. |
| render | boolean | Force (true) or suppress (false) headless rendering of JS pages (see Rendering JavaScript pages). Without the parameter, auto mode decides via heuristics. |
Example — filter Prometheus by metric prefix:
{
  "url": "http://payments.prod.svc:9090/metrics",
  "filter": "^chatcli_",
  "exclude": "^# HELP|^# TYPE"
}
Example — page within the filtered payload:
{
  "url": "http://svc/very-long-changelog",
  "filter": "^- ",
  "from_line": 50,
  "to_line": 80
}
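
The order of operations described in the parameter table (filter, then exclude, then the from_line/to_line window over the filtered view) can be sketched in Go as follows. The function name filterLines and the convention that empty patterns or zero bounds mean "not set" are assumptions made for this example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// filterLines keeps lines matching filter, drops lines matching exclude,
// then applies a 1-based inclusive window over the filtered view.
// Empty patterns and zero bounds are treated as "not set".
func filterLines(body, filter, exclude string, fromLine, toLine int) ([]string, error) {
	var keep, drop *regexp.Regexp
	var err error
	if filter != "" {
		if keep, err = regexp.Compile(filter); err != nil {
			return nil, fmt.Errorf("filter: %w", err)
		}
	}
	if exclude != "" {
		if drop, err = regexp.Compile(exclude); err != nil {
			return nil, fmt.Errorf("exclude: %w", err)
		}
	}
	var view []string
	for _, line := range strings.Split(body, "\n") {
		if keep != nil && !keep.MatchString(line) {
			continue
		}
		if drop != nil && drop.MatchString(line) {
			continue
		}
		view = append(view, line)
	}
	start, end := 0, len(view)
	if fromLine > 0 {
		start = fromLine - 1
	}
	if toLine > 0 && toLine < end {
		end = toLine
	}
	if start >= end {
		return nil, nil // window falls outside the filtered view
	}
	return view[start:end], nil
}

func main() {
	body := strings.Join([]string{
		"# HELP chatcli_requests_total Total requests",
		"# TYPE chatcli_requests_total counter",
		"chatcli_requests_total 42",
		"go_goroutines 8",
		"chatcli_errors_total 1",
	}, "\n")
	lines, err := filterLines(body, "^chatcli_", "^# HELP|^# TYPE", 0, 0)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(strings.Join(lines, "\n"))
}
```

Because the window is applied last, from_line and to_line count lines in the filtered view, not in the raw body.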
Example — save everything to the scratch dir and read slices on demand:
{
  "url": "http://svc/metrics",
  "save_to_file": true
}
Response:
[full response saved to /tmp/chatcli-agent-Xy7K3a/scratch/webfetch_1712...txt — 142,318 bytes.
Use read_file with start/end to examine specific ranges.]

[first ~50K chars of content]
The agent can then issue a read_file against the absolute path returned, choosing the exact line range that matters.
save_to_file always confines writes to CHATCLI_AGENT_TMPDIR. If save_path is an absolute path or contains .., only filepath.Base is used and the write is validated to ensure the result stays within the scratch dir.
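
A minimal Go sketch of the confinement rule just described: reduce save_path to its base name, join it to the scratch dir, then verify the result has not escaped. The function name resolveSavePath and the /tmp/scratch directory are hypothetical; the real validation in ChatCLI may differ in detail.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// resolveSavePath discards any directory prefix from savePath and
// returns a path guaranteed to sit directly inside scratchDir.
func resolveSavePath(scratchDir, savePath string) (string, error) {
	name := filepath.Base(filepath.Clean(savePath))
	if name == "." || name == ".." || name == string(filepath.Separator) {
		return "", errors.New("save_path has no usable file name")
	}
	full := filepath.Join(scratchDir, name)

	// Defensive second check: the joined path must still be under scratchDir.
	rel, err := filepath.Rel(scratchDir, full)
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("%q escapes the scratch dir", savePath)
	}
	return full, nil
}

func main() {
	for _, p := range []string{"report.txt", "../../etc/passwd", "/var/log/out.txt", ".."} {
		full, err := resolveSavePath("/tmp/scratch", p)
		fmt.Println(full, err)
	}
}
```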

Smart auto-save

When the LLM calls @webfetch without filter, exclude, range or an explicit save_to_file, and the returned body exceeds the auto-save threshold, ChatCLI automatically promotes the call to save_to_file=true. This shields the context from giant pages without requiring the model to know the body size in advance. Default: bodies above 10,000 bytes (configurable via CHATCLI_WEBFETCH_AUTOSAVE_BYTES) trigger the auto-save. The inline result is a compact preview (~5,000 chars), and the response opens with an explicit marker:
[auto-saved: response was 142318 bytes — too large to inline.
 Full body is at /tmp/chatcli-agent-.../scratch/webfetch_1712....txt.
 Preview below; use read_file with start/end or rerun with
 filter/from_line/to_line for specific ranges.]

[first ~5000 chars of extracted text]
...(auto-truncated — full body saved to disk)
To disable or loosen auto-save — e.g. offline batches where the agent needs the whole body inline — raise the threshold:
export CHATCLI_WEBFETCH_AUTOSAVE_BYTES=1000000   # 1 MB — effectively disabled for most pages
On a per-call basis, passing any filter (even .*), explicit from_line/to_line, or save_to_file=false disables the automatic promotion.
See Token Efficiency for the full rationale behind this default.
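
The promotion rule can be summarized as a small predicate. The Go sketch below is an illustration only: the struct and function names are hypothetical, and falling back to the default when the environment variable is missing, unparsable, or non-positive is an assumption rather than documented behavior.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

const defaultAutosaveBytes = 10000 // default quoted in the text

// fetchArgs holds only the parameters that influence auto-save.
type fetchArgs struct {
	Filter, Exclude  string
	FromLine, ToLine int
	SaveToFile       *bool // nil means the caller did not set it
}

// autosaveThreshold reads CHATCLI_WEBFETCH_AUTOSAVE_BYTES.
// Assumption: invalid or non-positive values fall back to the default.
func autosaveThreshold() int {
	v, err := strconv.Atoi(os.Getenv("CHATCLI_WEBFETCH_AUTOSAVE_BYTES"))
	if err == nil && v > 0 {
		return v
	}
	return defaultAutosaveBytes
}

// shouldAutoSave reports whether a call is promoted to save_to_file=true.
func shouldAutoSave(a fetchArgs, bodyBytes, threshold int) bool {
	if a.SaveToFile != nil {
		return false // explicit true or false: no automatic promotion
	}
	if a.Filter != "" || a.Exclude != "" || a.FromLine > 0 || a.ToLine > 0 {
		return false // any filter or range disables promotion
	}
	return bodyBytes > threshold
}

func main() {
	threshold := autosaveThreshold()
	fmt.Println(shouldAutoSave(fetchArgs{}, 142318, threshold))
	fmt.Println(shouldAutoSave(fetchArgs{Filter: ".*"}, 142318, threshold))
}
```

Modeling save_to_file as a pointer keeps "not set" distinct from an explicit false, which is what lets save_to_file=false opt out on a per-call basis.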

Rendering JavaScript pages (SPA)

Pages that build their content client-side (SPAs, JS-rendered tables) return an empty “shell” on a static fetch — <div id="root"> plus bundles, no actual content. @webfetch solves this with an escalation chain:
1. Static fetch
   Always the first step. Server-rendered pages stop here, at zero extra cost.

2. JS-shell detection
   Heuristics: thin extracted text + structural signals (empty #root/#app mount points, <noscript> warnings, and framework markers for React, Next, Angular, Vue, Nuxt, Svelte, Gatsby, Remix, Flutter).

3. Headless render via CDP
   A real Chromium renders the page (waits for load + DOM stability) and the settled DOM flows through the same extraction/filter/auto-save pipeline.

4. Browserless fallbacks
   Without a browser, the embedded __NEXT_DATA__ state (Next.js) is recovered from the static HTML; as a last resort an honest note tells the model the fetch may be incomplete.
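
To make the detection step more concrete, here is a deliberately simplified Go sketch of a "thin text plus structural signal" check. The 200-character threshold and the marker list are assumptions chosen for illustration; ChatCLI's actual heuristics are not specified in this document and are likely more nuanced.

```go
package main

import (
	"fmt"
	"strings"
)

// thinTextChars is an arbitrary illustrative threshold, not ChatCLI's value.
const thinTextChars = 200

// shellMarkers is a small, hypothetical subset of structural signals,
// written in lowercase because the HTML is lowercased before matching.
var shellMarkers = []string{
	`<div id="root"></div>`, // empty React-style mount point
	`<div id="app"></div>`,  // empty Vue-style mount point
	`<noscript`,             // "enable JavaScript" warnings
	`__next_data__`,         // Next.js embedded state
	`ng-version=`,           // Angular root attribute
}

// looksLikeJSShell returns true when the extracted text is thin
// and the raw HTML carries at least one structural marker.
func looksLikeJSShell(rawHTML, extractedText string) bool {
	if len(strings.TrimSpace(extractedText)) >= thinTextChars {
		return false // enough real content: treat as server-rendered
	}
	lower := strings.ToLower(rawHTML)
	for _, marker := range shellMarkers {
		if strings.Contains(lower, marker) {
			return true
		}
	}
	return false
}

func main() {
	shell := `<html><body><div id="root"></div><script src="/bundle.js"></script></body></html>`
	fmt.Println(looksLikeJSShell(shell, ""))
	fmt.Println(looksLikeJSShell(`<html><body><p>ok</p></body></html>`, "ok"))
}
```

Requiring both signals keeps short but complete server-rendered pages from triggering a browser launch.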
Browser discovery (in order): Chrome → Chromium → Edge → Brave → explicit path in CHATCLI_WEBFETCH_RENDER_BROWSER → opt-in download of a pinned Chromium (~150 MB, once) with CHATCLI_WEBFETCH_RENDER_AUTOPROVISION=true. No API keys, no external services.
Production posture: one shared browser per process (lazy launch, health-checked reuse, shutdown after 2 idle minutes); a circuit breaker (2 launch failures → 5-minute pause); an incognito context per render, so cookies never leak between sites; and SSRF protection enforced inside the browser, where every sub-request the page fires is validated through CDP interception, mirroring the guard on the regular HTTP path. The rendered DOM is capped at 10 MB.
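
The circuit-breaker behavior (2 launch failures, then a 5-minute pause) can be sketched as below. This is an illustrative, single-goroutine version with hypothetical names; it assumes a successful launch resets the failure count, and a real implementation would also need locking.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var errLaunchFailed = errors.New("browser launch failed")

// launchBreaker pauses launch attempts after repeated failures.
// Not safe for concurrent use; add a mutex in real code.
type launchBreaker struct {
	maxFailures int
	cooldown    time.Duration
	failures    int
	openUntil   time.Time
}

func newLaunchBreaker() *launchBreaker {
	return &launchBreaker{maxFailures: 2, cooldown: 5 * time.Minute}
}

// allow reports whether a launch may be attempted at time now.
func (b *launchBreaker) allow(now time.Time) bool {
	return !now.Before(b.openUntil)
}

// record registers the outcome of a launch attempt made at time now.
func (b *launchBreaker) record(now time.Time, err error) {
	if err == nil {
		b.failures = 0 // assumption: success resets the counter
		return
	}
	b.failures++
	if b.failures >= b.maxFailures {
		b.openUntil = now.Add(b.cooldown)
		b.failures = 0
	}
}

func main() {
	b := newLaunchBreaker()
	now := time.Now()
	b.record(now, errLaunchFailed)
	b.record(now, errLaunchFailed)
	fmt.Println(b.allow(now.Add(time.Minute)))     // still inside the pause
	fmt.Println(b.allow(now.Add(6 * time.Minute))) // pause has elapsed
}
```

Passing the clock in as an argument keeps the breaker easy to test without sleeping.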
| Variable | Description | Default |
| --- | --- | --- |
| CHATCLI_WEBFETCH_RENDER | auto (heuristics decide), always, never | auto |
| CHATCLI_WEBFETCH_RENDER_TIMEOUT | Render timeout in seconds | 25 |
| CHATCLI_WEBFETCH_RENDER_BROWSER | Absolute path to a specific Chromium-based binary | (auto-detect) |
| CHATCLI_WEBFETCH_RENDER_AUTOPROVISION | Allows the one-time download of a pinned Chromium when no browser exists | false |
> @webfetch https://app-spa.example.com/dashboard
Page appears JS-rendered; escalating to headless browser...

@websearch

Performs a web search and returns results with title, URL and snippet. Every backend is keyless by design, so there is no third-party API key to register. DuckDuckGo (HTML scraping) is the zero-config default; self-hosted SearxNG is preferred in corporate environments when you point ChatCLI at an internal instance; Brave Search and Mojeek serve as additional zero-config fallbacks.

Available backends

| Backend | Requires | When it shines | Pain points |
| --- | --- | --- | --- |
| DuckDuckGo | Nothing | Default, works out of the box, zero config | DDG occasionally serves anti-bot interstitials (CAPTCHA) and may return empty results |
| Self-hosted SearxNG | SEARXNG_URL pointing to your instance | Locked-down corporate networks: you control the backend, no egress to public scraping, aggregates several engines (Bing/Google/Qwant) through the instance | Requires running an internal container + enabling JSON in settings.yml |
| Brave Search | Nothing | Independent index (not a meta-search), so real diversity when DDG blocks | HTML scraping; layout may shift (parser anchored on stable semantic attributes) |
| Mojeek | Nothing | Independent index, UK-based crawler | Some networks receive a 403 for automated traffic; the chain simply moves on |

Fallback chain

For each query, ChatCLI builds an ordered chain of backends. If the first one fails or returns empty, the next one is tried automatically. Default order (CHATCLI_WEBSEARCH_PROVIDER unset or auto):
1. DuckDuckGo          ← default, always available
2. SearxNG             ← only added to the chain when SEARXNG_URL is set
3. Brave Search        ← independent index, zero config
4. Mojeek              ← independent index, zero config
Explicit override to prefer SearxNG:
export CHATCLI_WEBSEARCH_PROVIDER=searxng
export SEARXNG_URL=https://searx.internal.corp
Result: the chain becomes searxng → duckduckgo → brave → mojeek. The others remain as fallbacks if the SearxNG instance fails. The same applies to any provider: CHATCLI_WEBSEARCH_PROVIDER=brave moves Brave to the front and keeps the rest behind it.
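
The ordering rules above can be expressed as a short function. The Go sketch below uses a hypothetical buildChain helper and assumes that naming a provider which is not in the chain (for example searxng without SEARXNG_URL) simply leaves the default order untouched; the document does not state what ChatCLI does in that case.

```go
package main

import (
	"fmt"
	"strings"
)

// buildChain returns the ordered backend names for a query.
// provider is the CHATCLI_WEBSEARCH_PROVIDER value, searxngURL is SEARXNG_URL.
func buildChain(provider, searxngURL string) []string {
	// Default order: SearxNG only joins when a URL is configured.
	chain := []string{"duckduckgo"}
	if searxngURL != "" {
		chain = append(chain, "searxng")
	}
	chain = append(chain, "brave", "mojeek")

	preferred := strings.ToLower(strings.TrimSpace(provider))
	if preferred == "" || preferred == "auto" {
		return chain
	}

	// Move the preferred backend to the front, keep the rest in order.
	ordered := make([]string, 0, len(chain))
	for _, name := range chain {
		if name == preferred {
			ordered = append(ordered, name)
		}
	}
	for _, name := range chain {
		if name != preferred {
			ordered = append(ordered, name)
		}
	}
	return ordered
}

func main() {
	fmt.Println(buildChain("", ""))
	fmt.Println(buildChain("searxng", "https://searx.internal.corp"))
	fmt.Println(buildChain("brave", ""))
}
```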

Environment variables

| Variable | Description | Default |
| --- | --- | --- |
| CHATCLI_WEBSEARCH_PROVIDER | Force a specific backend to the top of the chain: searxng, duckduckgo, brave, mojeek, or auto. | auto |
| SEARXNG_URL | Root URL of the SearxNG instance (e.g. https://searx.internal.corp). When set, SearxNG joins the chain. | (unset) |

/websearch command

Interactive manager for the preferred backend. Autocomplete available for subcommands and provider names.
| Subcommand | Effect |
| --- | --- |
| /websearch or /websearch status | Show current provider + active chain |
| /websearch list | List known providers and which are configured |
| /websearch provider <name> | Set preferred provider for the session, where <name> is searxng, duckduckgo, brave, mojeek, or auto (sets CHATCLI_WEBSEARCH_PROVIDER in the process) |
| /websearch reset | Remove the override and return to auto mode |
/websearch provider applies only to the current session. To persist, export the env in your shell or add it to .env.

Configuring self-hosted SearxNG

The SearxNG instance must have its JSON API enabled — it isn’t on by default. In the SearxNG settings.yml:
search:
  formats:
    - html
    - json
If you point SEARXNG_URL at an instance without JSON enabled, ChatCLI returns an actionable error instead of a cryptic decode failure:
SearxNG did not return JSON (Content-Type="text/html"). Enable JSON in settings.yml: search.formats: [html, json]
The official searxng/searxng image on Docker Hub typically boots in about 30 seconds. In a corporate environment, a single container behind internal ingress is enough, and it removes the “DDG blocked by the proxy” problem.
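
For reference, a SearxNG instance with JSON enabled answers GET /search?q=...&format=json with a results array whose entries carry title, url, and content fields. The Go sketch below queries such an instance and reproduces the Content-Type check described above; the function and type names are hypothetical and the error wording is only modeled on the message shown, not copied from ChatCLI's source.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"strings"
)

// searchResult maps the SearxNG JSON fields used here.
type searchResult struct {
	Title   string `json:"title"`
	URL     string `json:"url"`
	Snippet string `json:"content"`
}

// parseSearxResponse rejects non-JSON replies with an actionable error,
// then decodes the results array.
func parseSearxResponse(contentType string, body []byte) ([]searchResult, error) {
	if !strings.HasPrefix(contentType, "application/json") {
		return nil, fmt.Errorf("SearxNG did not return JSON (Content-Type=%q); enable json under search.formats in settings.yml", contentType)
	}
	var payload struct {
		Results []searchResult `json:"results"`
	}
	if err := json.Unmarshal(body, &payload); err != nil {
		return nil, fmt.Errorf("decode SearxNG response: %w", err)
	}
	return payload.Results, nil
}

// searchSearxNG performs the HTTP call; baseURL is the SEARXNG_URL value.
func searchSearxNG(client *http.Client, baseURL, query string) ([]searchResult, error) {
	params := url.Values{"q": {query}, "format": {"json"}}
	endpoint := strings.TrimRight(baseURL, "/") + "/search?" + params.Encode()
	resp, err := client.Get(endpoint)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(io.LimitReader(resp.Body, 1<<20)) // cap at 1 MiB
	if err != nil {
		return nil, err
	}
	return parseSearxResponse(resp.Header.Get("Content-Type"), body)
}

func main() {
	// Offline demonstration of the parsing step; searchSearxNG needs a live instance.
	sample := []byte(`{"results":[{"title":"Example","url":"https://example.com","content":"A snippet"}]}`)
	results, err := parseSearxResponse("application/json; charset=utf-8", sample)
	fmt.Println(results, err)
	_, err = parseSearxResponse("text/html", []byte("<html></html>"))
	fmt.Println(err)
}
```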

How it works internally

1. Chain selection
   SelectSearchChain() reads CHATCLI_WEBSEARCH_PROVIDER and SEARXNG_URL and returns an ordered list of backends to try.

2. Sequential attempt
   For each backend in the chain: call the search function. If results come back, stop and format. If it fails or returns zero results, log the reason and advance to the next.

3. Formatting
   Results are rendered as text with "(via <provider>)" in the header, followed by numbered entries with title, URL, and snippet.
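
The sequential attempt and formatting steps can be sketched as follows. This is a simplified Go illustration with hypothetical types and stubbed backends; the output layout is modeled on the example shown later on this page, not taken from ChatCLI's source.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

var errCaptcha = errors.New("anti-bot interstitial")

type result struct{ Title, URL, Snippet string }

// backend pairs a display name with a search function.
type backend struct {
	Name   string
	Search func(query string) ([]result, error)
}

// searchWithFallback tries each backend in order and returns the first
// non-empty result set together with the name of the backend that answered.
func searchWithFallback(chain []backend, query string) (string, []result, error) {
	var failures []string
	for _, b := range chain {
		results, err := b.Search(query)
		switch {
		case err != nil:
			failures = append(failures, fmt.Sprintf("%s: %v", b.Name, err))
		case len(results) == 0:
			failures = append(failures, b.Name+": no results")
		default:
			return b.Name, results, nil
		}
	}
	return "", nil, errors.New("all backends failed: " + strings.Join(failures, "; "))
}

// formatResults renders the header and the numbered entries.
func formatResults(query, provider string, results []result) string {
	var sb strings.Builder
	fmt.Fprintf(&sb, "Search results for: %q (via %s)\n", query, provider)
	for i, r := range results {
		fmt.Fprintf(&sb, "\n%d. %s\n   URL: %s\n   %s\n", i+1, r.Title, r.URL, r.Snippet)
	}
	return sb.String()
}

func main() {
	chain := []backend{
		{Name: "DuckDuckGo", Search: func(string) ([]result, error) { return nil, errCaptcha }},
		{Name: "SearxNG", Search: func(string) ([]result, error) {
			return []result{{"Example", "https://example.com", "A snippet"}}, nil
		}},
	}
	name, results, err := searchWithFallback(chain, "opentelemetry go setup tutorial")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Print(formatResults("opentelemetry go setup tutorial", name, results))
}
```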

Argument Formats

{
  "tool": "websearch",
  "args": {
    "query": "golang rate limiting best practices 2026"
  }
}

Example

User: Search how to set up OpenTelemetry with Go

Agent: I'll search for up-to-date information on that.

[tool_call: websearch {"query": "opentelemetry go setup tutorial"}]

Search results for: "opentelemetry go setup tutorial" (via DuckDuckGo)

1. Getting Started with OpenTelemetry in Go
   URL: https://opentelemetry.io/docs/languages/go/getting-started/
   Official guide to instrumenting Go applications with OpenTelemetry...

2. OpenTelemetry Go SDK - Complete Guide
   URL: https://example.com/otel-go-guide
   Step-by-step tutorial covering traces, metrics and logs...
The (via DuckDuckGo) or (via SearxNG) header makes it clear which backend responded — useful for diagnosing why a query returned nothing (e.g. DDG served a CAPTCHA → falls back to SearxNG).
Why keyless? ChatCLI is used in corporate environments where managing third-party API keys (Brave Search, Tavily, SerpAPI) creates operational friction — registration, rotation, approvals. Self-hosted SearxNG solves network lockdown without recurring cost; DuckDuckGo covers casual use without any config.

Comparison

| Aspect | @webfetch | @websearch |
| --- | --- | --- |
| Purpose | Read content from a specific URL | Search the web for a query |
| Input | URL | Search query |
| Output | Clean text from the page | List of results (title + URL + snippet) |
| When to use | You know the exact URL | You need to find information |
| Engine | HTTP GET + HTML parser | DuckDuckGo and Brave Search (HTML scraping), SearxNG (JSON API), Mojeek |

Availability

Web tools are available in the following modes:
| Mode | @webfetch | @websearch |
| --- | --- | --- |
| Chat | No | No |
| Agent (/agent) | Yes | Yes |
| Coder (/coder) | Yes | Yes |
| One-shot (-p) | Yes (with --agent) | Yes (with --agent) |
In interactive chat mode, web tools are not available. You need to be in agent or coder mode for the LLM to invoke them as tool calls.

Next Steps

MCP Integration

Integrate additional web tools via MCP servers.

Agentic Plugins

See all tools available to the agent.