ChatCLI supports AWS Bedrock as a native provider (BEDROCK) with three dispatch paths that cover the entire AWS-hosted catalog:
  • Anthropic Messages: anthropic.* and inference profiles (global./us./eu./apac.anthropic.*). Preserves cache markers and extended-thinking budget.
  • OpenAI Chat Completions: openai.gpt-oss-* (OpenAI’s open-weights models on Bedrock).
  • Converse API (default): AWS’s unified schema, which covers everything else (Llama, Amazon Nova, Mistral, Cohere, AI21 Jamba, DeepSeek, Stability, Writer Palmyra, Moonshot Kimi, MiniMax, Qwen, Z.AI/GLM, Google Gemma, NVIDIA Nemotron, TwelveLabs Pegasus, and any provider AWS onboards next).
The /switch --model listing trusts AWS-side responses from ListFoundationModels + ListInferenceProfiles 100% — there is no hardcoded allowlist. A new model on AWS shows up on the next /switch --model without a ChatCLI release. Ideal for corporate environments that already manage billing, compliance, and access control through AWS — no need for API keys from the original providers.

Why AWS Bedrock?

No per-provider API key

Uses existing AWS credentials (IAM role, ~/.aws/credentials, AWS_PROFILE). Single identity across every model.

AWS billing and compliance

Costs appear on your AWS bill. CloudTrail logs, native Bedrock guardrails.

Full catalog

Anthropic, OpenAI, Llama, Nova, Mistral, Cohere, AI21, DeepSeek, Moonshot Kimi, MiniMax, Qwen, Z.AI/GLM, Gemma, Nemotron, TwelveLabs — all under one account.

VPC endpoints

Works in private environments via AWS_ENDPOINT_URL_BEDROCK_RUNTIME.

Auto-detected family

Anthropic and OpenAI use dedicated paths (cache, thinking); the rest goes through Converse — one call covers all.

Native embeddings

Embeddings provider reuses the same AWS credential chain. Titan v1/v2 + Cohere v3. See RAG + HyDE.

Configuration

The provider is auto-detected when ChatCLI finds valid AWS credentials (not just file existence):
  • Static creds in env: AWS_ACCESS_KEY_ID
  • Profile selection: AWS_PROFILE (via env var or .env file)
  • ~/.aws/credentials file with at least one non-empty aws_access_key_id
  • AWS SSO: SSO profile in ~/.aws/config (detects sso_session, sso_start_url, sso_account_id)
  • Assume-role / credential_process: profiles with role_arn or credential_process in ~/.aws/config
  • SSO token cache: presence of files in ~/.aws/sso/cache/ (indicating a prior aws sso login)
  • Web Identity Token (EKS IRSA): AWS_WEB_IDENTITY_TOKEN_FILE
  • Container Credentials (ECS): AWS_CONTAINER_CREDENTIALS_RELATIVE_URI / _FULL_URI
The mere existence of ~/.aws/config with only region or output does not activate Bedrock. The file must contain credential configuration (SSO, assume-role, credential_process), or credentials must exist in another source.
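The detection rules above can be sketched as a single predicate. This is an illustrative Python sketch, not ChatCLI's actual Go code: the function name has_aws_credentials and its parameters are hypothetical, and the file contents are passed in as strings so the example stays self-contained.

```python
import configparser

# Keys that mark a ~/.aws/config section as carrying credential configuration.
CONFIG_CREDENTIAL_KEYS = {
    "sso_session", "sso_start_url", "sso_account_id",
    "role_arn", "credential_process",
}

# Environment variables that indicate a credential source on their own.
ENV_CREDENTIAL_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_PROFILE",
    "AWS_WEB_IDENTITY_TOKEN_FILE",
    "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
    "AWS_CONTAINER_CREDENTIALS_FULL_URI",
)


def _parse_ini(text):
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


def has_aws_credentials(env, credentials_text="", config_text="", sso_cache_files=()):
    """Hypothetical helper: True when at least one documented source is present."""
    if any(env.get(name) for name in ENV_CREDENTIAL_VARS):
        return True
    # ~/.aws/credentials needs at least one non-empty aws_access_key_id.
    creds = _parse_ini(credentials_text)
    if any(creds.get(s, "aws_access_key_id", fallback="").strip()
           for s in creds.sections()):
        return True
    # ~/.aws/config counts only if a section holds SSO, assume-role,
    # or credential_process keys; region/output alone are ignored.
    config = _parse_ini(config_text)
    for section in config.sections():
        if CONFIG_CREDENTIAL_KEYS & set(config[section].keys()):
            return True
    # Files in ~/.aws/sso/cache/ hint at a prior `aws sso login`.
    return bool(sso_cache_files)
```

Passing a config that contains only region and output returns False, which mirrors the rule that such a file does not activate Bedrock.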

Option 1: ~/.aws/credentials (static credentials)

If you already use AWS CLI, just have a profile configured:
# ~/.aws/credentials
[default]
aws_access_key_id = AKIA...
aws_secret_access_key = ...

[corp-prod]
aws_access_key_id = AKIA...
aws_secret_access_key = ...

# Shell: select the profile, then start ChatCLI
export AWS_PROFILE=corp-prod
export BEDROCK_REGION=us-east-1   # optional, defaults to us-east-1
chatcli
Inside ChatCLI:
/switch BEDROCK
You can also set AWS_PROFILE in your .env file instead of exporting in the shell:
AWS_PROFILE=corp-prod
BEDROCK_REGION=us-east-1
LLM_PROVIDER=BEDROCK
ChatCLI reads the .env via godotenv and resolves the profile correctly.

Option 2: AWS SSO (IAM Identity Center)

If your company uses AWS SSO, configure the profile in ~/.aws/config:
[profile my-sso]
sso_session = my-session
sso_account_id = 123456789012
sso_role_name = MyRole
region = us-east-1

[sso-session my-session]
sso_start_url = https://my-company.awsapps.com/start
sso_region = us-east-1
# Log in (opens browser)
aws sso login --profile my-sso

# Use with ChatCLI (any of these):
export AWS_PROFILE=my-sso && chatcli
AWS_PROFILE=my-sso chatcli

# Or in .env:
echo 'AWS_PROFILE=my-sso' >> .env
chatcli
ChatCLI automatically detects SSO profiles in ~/.aws/config (via the sso_session, sso_start_url, and sso_account_id keys). If the SSO token expires, the error will be clear (SSOTokenProviderError); just run aws sso login again.
Important: the AWS SDK does not know which profile is “logged in”. You must indicate the profile via AWS_PROFILE (env, .env, or flag). If your SSO profile is named default, it is used automatically without AWS_PROFILE.

Option 3: Environment variables (static credentials)

export AWS_ACCESS_KEY_ID=AKIA...
export AWS_SECRET_ACCESS_KEY=...
export AWS_SESSION_TOKEN=...      # if using STS
export AWS_REGION=us-east-1

Option 4: IAM Role (EC2/ECS/EKS)

On AWS-native environments, nothing to configure — the SDK picks up the role automatically through IMDSv2 / webidentity. Just make sure the role has the IAM permissions below.
ChatCLI disables the IMDS probe (169.254.169.254) by default on machines that are not EC2/ECS/EKS, to avoid unnecessary timeouts. IMDS is automatically enabled when container/EKS env vars are detected (AWS_CONTAINER_CREDENTIALS_*, AWS_WEB_IDENTITY_TOKEN_FILE, ECS_CONTAINER_METADATA_URI*).
To force either behavior, use:
  • AWS_EC2_METADATA_DISABLED=true — explicitly disable IMDS
  • CHATCLI_BEDROCK_ENABLE_IMDS=1 — force enable IMDS (useful on EC2 without standard env vars)
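The IMDS decision described above might look roughly like the sketch below. It is illustrative Python, not ChatCLI's Go implementation. The function name imds_enabled is hypothetical, and the relative priority of the two override variables is not stated in this page, so the sketch assumes that an explicit disable wins.

```python
TRUTHY = {"1", "true"}

# Variables whose presence signals a container or EKS environment.
CONTAINER_ENV_VARS = (
    "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
    "AWS_CONTAINER_CREDENTIALS_FULL_URI",
    "AWS_WEB_IDENTITY_TOKEN_FILE",
)


def imds_enabled(env):
    """Hypothetical helper: decide whether the IMDS probe should run."""
    # Assumption: an explicit disable beats every other signal.
    if env.get("AWS_EC2_METADATA_DISABLED", "").strip().lower() == "true":
        return False
    # Force-enable, useful on EC2 hosts without the standard env vars.
    if env.get("CHATCLI_BEDROCK_ENABLE_IMDS", "").strip().lower() in TRUTHY:
        return True
    if any(env.get(name) for name in CONTAINER_ENV_VARS):
        return True
    # ECS_CONTAINER_METADATA_URI* also matches versioned variants.
    return any(
        key.startswith("ECS_CONTAINER_METADATA_URI") and value
        for key, value in env.items()
    )
```

On a laptop with none of these variables set, imds_enabled({}) is False, so no request to 169.254.169.254 is attempted.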

IAM Permissions

Minimum permissions to invoke and list models. The bedrock:InvokeModel action covers both InvokeModel (Anthropic/OpenAI) and Converse (everything else):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": [
        "arn:aws:bedrock:*::foundation-model/*",
        "arn:aws:bedrock:*:*:inference-profile/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:ListFoundationModels",
        "bedrock:ListInferenceProfiles"
      ],
      "Resource": "*"
    }
  ]
}
To restrict to specific providers, swap the Resource ARNs for a list (e.g. arn:aws:bedrock:*::foundation-model/anthropic.*, arn:aws:bedrock:*::foundation-model/moonshotai.*). Remember to include the matching inference profile ARNs (*:inference-profile/*anthropic.* etc.) — otherwise Claude 3.7+ and equivalents from other providers stop working.
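As a rough illustration of that restriction, the Python sketch below generates the policy for a list of provider prefixes, emitting both the foundation-model ARN and the matching inference-profile ARN for each. The function name build_bedrock_policy is hypothetical; the ARN patterns are the ones quoted above.

```python
import json


def build_bedrock_policy(providers):
    """Hypothetical helper: restrict invoke permissions to the given providers."""
    resources = []
    for provider in providers:
        # Base foundation-model IDs for this provider.
        resources.append(f"arn:aws:bedrock:*::foundation-model/{provider}.*")
        # Matching inference profiles (global./us./eu./apac. prefixed IDs).
        resources.append(f"arn:aws:bedrock:*:*:inference-profile/*{provider}.*")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                "Resource": resources,
            },
            {
                "Effect": "Allow",
                "Action": [
                    "bedrock:ListFoundationModels",
                    "bedrock:ListInferenceProfiles",
                ],
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_bedrock_policy(["anthropic", "moonshotai"]), indent=2))
```

Generating both ARNs in the same loop makes it harder to forget the inference-profile entry when a provider is added.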
ListFoundationModels and ListInferenceProfiles are used by /switch --model to dynamically discover what your account can invoke. Without them, ChatCLI falls back to the static catalog (still functional but can’t reflect account-specific access).
Also, in the Bedrock console you must enable model access for each provider you want to use (one-time per account + region): Bedrock Console → Model access → Request access.

Model families and schema selection

Bedrock uses different schemas depending on the model. ChatCLI has three dispatch paths and auto-detects which one to use from the model-id prefix:
| Model id prefix | Family | Schema | Why |
| --- | --- | --- | --- |
| anthropic.*, global./us./eu./apac.anthropic.* | Anthropic Messages | anthropic_version, messages, system (with cache_control) | Preserves cache breakpoints, extended-thinking budget, every Claude knob |
| openai.*, us.openai.*, etc. | OpenAI Chat Completions | messages, max_completion_tokens | Stable schema with broad coverage for GPT-OSS |
| Anything else (Llama, Nova, Mistral, Cohere, AI21, DeepSeek, Stability, Writer, Moonshot, MiniMax, Qwen, Z.AI, Gemma, Nemotron, TwelveLabs, …) | Converse API (default) | messages, system, inferenceConfig | AWS-unified schema: one implementation covers every provider that doesn’t need special features |

Manual override

To force a family regardless of the prefix (e.g. test Converse on an Anthropic model), use the env var:
export BEDROCK_PROVIDER=anthropic   # or "claude"
export BEDROCK_PROVIDER=openai      # or "gpt"
export BEDROCK_PROVIDER=converse    # or "auto"
Accepted values: anthropic / claude, openai / gpt, converse / auto (case-insensitive). The env var takes precedence over prefix detection.
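A minimal sketch of the selection logic, in illustrative Python rather than ChatCLI's Go code. The names detect_family and strip_geo_prefix are hypothetical. Two assumptions are made: "auto" is treated as an alias of "converse" because the accepted values are listed in pairs, and an unrecognized override raises an error, which this page does not specify.

```python
GEO_PREFIXES = ("global.", "us.", "eu.", "apac.")

# Accepted BEDROCK_PROVIDER values, as listed above (case-insensitive).
OVERRIDES = {
    "anthropic": "anthropic", "claude": "anthropic",
    "openai": "openai", "gpt": "openai",
    "converse": "converse", "auto": "converse",  # assumption: alias pair
}


def strip_geo_prefix(model_id):
    """Drop a leading inference-profile geography prefix, if any."""
    for prefix in GEO_PREFIXES:
        if model_id.startswith(prefix):
            return model_id[len(prefix):]
    return model_id


def detect_family(model_id, override=""):
    """Hypothetical helper: pick the dispatch path for a model id."""
    key = override.strip().lower()
    if key:
        # The env var takes precedence over prefix detection.
        if key not in OVERRIDES:
            raise ValueError(f"unsupported BEDROCK_PROVIDER value: {override!r}")
        return OVERRIDES[key]
    base = strip_geo_prefix(model_id)
    if base.startswith("anthropic."):
        return "anthropic"
    if base.startswith("openai."):
        return "openai"
    return "converse"  # default path for everything else
```

For example, detect_family("us.deepseek.r1-v1:0") falls through to the Converse path, while passing override="converse" sends even an Anthropic id there.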
Why do Anthropic and OpenAI stay out of Converse? Anthropic’s cache_control breakpoints and extended-thinking knobs map onto Converse with a different shape — we deliberately don’t disturb the production-proven cache planner. OpenAI gpt-oss runs stable on direct InvokeModel and Converse coverage for those IDs varies by region. Set BEDROCK_PROVIDER=converse if you want to experiment with everything on Converse.
No hardcoded allowlist. /switch --model lists every text-output model with on-demand inference your account has access to — Kimi K2.6, GLM 4.7, Qwen3 Coder Next, Nemotron Nano 3, anything new AWS adds — without a release on our side. If a rare ID doesn’t fit Converse, ChatCLI returns a friendly error pointing the way.

Inference Profiles vs. Model IDs

This is the most important detail when using Claude on Bedrock. Modern Anthropic models (3.7, 4.x, 4.5, 4.6) do NOT accept direct on-demand invocation by base model ID. Attempting this returns:
on-demand throughput isn't supported. Request with the id or arn of an
inference profile that contains this model.
The fix is to use an inference profile ID, a logical identifier that routes the call to a region with available capacity. It carries a geography prefix:
| Prefix | Meaning |
| --- | --- |
| global.* | Global: newest tier, worldwide availability (recommended) |
| us.* | Cross-region USA (us-east-1, us-east-2, us-west-2) |
| eu.* | Cross-region Europe |
| apac.* | Cross-region Asia-Pacific |
Example:
anthropic.claude-sonnet-4-5-20250929-v1:0           ❌ on-demand error
global.anthropic.claude-sonnet-4-5-20250929-v1:0    ✅ works
us.anthropic.claude-sonnet-4-5-20250929-v1:0        ✅ works
ChatCLI already uses a global inference profile as the default model (global.anthropic.claude-sonnet-4-5-20250929-v1:0). Claude 3 and 3.5 models still accept direct base-ID invocation and are also in the catalog.
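The prefix convention can be handled with plain string operations. The Python sketch below is illustrative only, and the helper names split_profile_id and to_inference_profile are hypothetical. It only builds the ID string; it does not check that a profile with that geography actually exists for the model in your account.

```python
GEOGRAPHIES = ("global", "us", "eu", "apac")


def split_profile_id(model_id):
    """Return (geography, base_id); geography is None for a bare base ID."""
    head, _, rest = model_id.partition(".")
    if head in GEOGRAPHIES and rest:
        return head, rest
    return None, model_id


def to_inference_profile(model_id, geo="global"):
    """Hypothetical helper: build an inference profile ID for a geography."""
    if geo not in GEOGRAPHIES:
        raise ValueError(f"unknown geography prefix: {geo!r}")
    # Replace any existing geography prefix instead of stacking a second one.
    _, base = split_profile_id(model_id)
    return f"{geo}.{base}"


if __name__ == "__main__":
    print(to_inference_profile("anthropic.claude-sonnet-4-5-20250929-v1:0"))
    print(to_inference_profile("us.anthropic.claude-sonnet-4-5-20250929-v1:0", geo="eu"))
```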

Model Listing

/switch --model queries two live sources and merges them with the static catalog:
  1. bedrock:ListFoundationModels with ByOutputModality: TEXT — text-output models available in the region.
  2. bedrock:ListInferenceProfiles — regional/global profiles (paginated).
Two AWS-side filters guarantee that only invokable IDs appear:
  • Modality TEXT (server-side) — drops embedding-only and image-only models.
  • InferenceTypesSupported contains ON_DEMAND — drops base IDs that are only invokable via inference profile (Claude 3.7+/4.x and cross-region-only IDs from other providers). Those models still appear via ListInferenceProfiles with global./us./eu./apac. prefix.
/switch BEDROCK
/switch --model
Example output (depends on your account’s permissions):
Available models for BEDROCK (API: 47 + catalog: 14):
  1. global.anthropic.claude-sonnet-4-6-20260115-v1:0 ... [api]
  2. global.anthropic.claude-opus-4-6-20260115-v1:0 ... [api]
  3. global.anthropic.claude-sonnet-4-5-20250929-v1:0 ... [api]
  4. moonshotai.kimi-k2.5 ... [api]
  5. moonshotai.kimi-k2-thinking ... [api]
  6. zai.glm-4-7 ... [api]
  7. minimax.m-2-5 ... [api]
  8. qwen.qwen3-coder-480b ... [api]
  9. us.deepseek.r1-v1:0 ... [api]
  10. anthropic.claude-3-5-sonnet-20241022-v2:0 ... [api]
  11. openai.gpt-oss-120b-1:0 ... [api]
  ...
Models with [api] are the ones your account can actually invoke in that region. [catalog] entries are static registrations that may or may not be enabled.
You still need to enable Model Access in the Bedrock console for each provider you intend to use. AWS gates this per account + region. If a model appears in ListFoundationModels but throws AccessDeniedException on invoke, model access is missing — usually a one-click fix in the console.
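The two filters and the merge with inference profiles described in this section can be sketched as follows. This is illustrative Python over plain dictionaries, not ChatCLI's Go code. The field names (modelId, outputModalities, inferenceTypesSupported, inferenceProfileId) follow the Bedrock API response shapes; the function name invokable_ids and the sample entries are made up for the example. In ChatCLI the modality filter is applied server-side, whereas this sketch applies it locally to stay self-contained.

```python
def invokable_ids(model_summaries, profile_summaries):
    """Hypothetical helper: keep only IDs that can be invoked directly."""
    ids = []
    for model in model_summaries:
        # Filter 1: text-output models only (drops embedding/image-only models).
        if "TEXT" not in model.get("outputModalities", []):
            continue
        # Filter 2: base IDs without ON_DEMAND need an inference profile.
        if "ON_DEMAND" not in model.get("inferenceTypesSupported", []):
            continue
        ids.append(model["modelId"])
    # Profile-only models come back in through their prefixed profile IDs.
    ids.extend(p["inferenceProfileId"] for p in profile_summaries)
    # De-duplicate while preserving order.
    return list(dict.fromkeys(ids))


# Made-up sample data, shaped like the API responses.
SAMPLE_MODELS = [
    {"modelId": "anthropic.claude-3-5-sonnet-20241022-v2:0",
     "outputModalities": ["TEXT"], "inferenceTypesSupported": ["ON_DEMAND"]},
    {"modelId": "anthropic.claude-sonnet-4-5-20250929-v1:0",
     "outputModalities": ["TEXT"], "inferenceTypesSupported": ["INFERENCE_PROFILE"]},
    {"modelId": "amazon.titan-embed-text-v2:0",
     "outputModalities": ["EMBEDDING"], "inferenceTypesSupported": ["ON_DEMAND"]},
]
SAMPLE_PROFILES = [
    {"inferenceProfileId": "global.anthropic.claude-sonnet-4-5-20250929-v1:0"},
]

if __name__ == "__main__":
    for model_id in invokable_ids(SAMPLE_MODELS, SAMPLE_PROFILES):
        print(model_id)
```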

Corporate Proxy and Private TLS

In corporate environments with a proxy intercepting TLS using a private CA, you may see:
tls: failed to verify certificate: x509: certificate signed by unknown authority
ChatCLI provides two Bedrock-specific env vars:
| Variable | Description |
| --- | --- |
| CHATCLI_BEDROCK_CA_BUNDLE | Path to a PEM bundle with the corporate CA. Merged into the system pool and used as RootCAs. Takes precedence over AWS_CA_BUNDLE. |
| CHATCLI_BEDROCK_INSECURE_SKIP_VERIFY | true disables TLS verification entirely (equivalent to Node’s NODE_TLS_REJECT_UNAUTHORIZED=0). Insecure: use only to confirm a TLS issue. |
# Recommended: use a bundle with the corporate CA
export CHATCLI_BEDROCK_CA_BUNDLE=/etc/ssl/corp-ca-bundle.pem

# Last resort (insecure)
export CHATCLI_BEDROCK_INSECURE_SKIP_VERIFY=true
If the proxy intercepts TLS for all providers (not just Bedrock), prefer the global variables CHATCLI_CA_BUNDLE / CHATCLI_TLS_INSECURE_SKIP_VERIFY — they apply to every outbound connection (LLM providers, web tools, gateway, MCP), and Bedrock inherits them as fallback. The Bedrock-specific ones take precedence when both are set. See Global TLS Trust.
CHATCLI_BEDROCK_INSECURE_SKIP_VERIFY=true logs a warning and accepts any certificate. Use only for troubleshooting — never in production.
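The precedence between the Bedrock-specific and global TLS variables could be resolved as in the sketch below. It is illustrative Python, not ChatCLI's Go code, and resolve_tls_settings is a hypothetical name. The position of AWS_CA_BUNDLE after CHATCLI_CA_BUNDLE is an assumption, since only its ordering relative to CHATCLI_BEDROCK_CA_BUNDLE is documented here; treating only the literal value true as enabling skip-verify is likewise an assumption.

```python
def _first_set(env, *names):
    """Return the first non-empty value among the given variable names."""
    for name in names:
        value = env.get(name, "").strip()
        if value:
            return value
    return None


def resolve_tls_settings(env):
    """Hypothetical helper: apply Bedrock-specific-first precedence."""
    ca_bundle = _first_set(
        env,
        "CHATCLI_BEDROCK_CA_BUNDLE",  # Bedrock-specific wins
        "CHATCLI_CA_BUNDLE",          # global fallback
        "AWS_CA_BUNDLE",              # assumption: consulted last
    )
    skip = _first_set(
        env,
        "CHATCLI_BEDROCK_INSECURE_SKIP_VERIFY",
        "CHATCLI_TLS_INSECURE_SKIP_VERIFY",
    )
    return {
        "ca_bundle": ca_bundle,
        "insecure_skip_verify": (skip or "").lower() == "true",
    }
```

Because the Bedrock-specific variable is consulted first, setting it to false keeps verification on for Bedrock even when the global skip-verify variable is true.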
HTTP(S) proxy is honored automatically through Go’s standard env vars:
export HTTPS_PROXY=http://proxy.corp:3128
export HTTP_PROXY=http://proxy.corp:3128
export NO_PROXY=localhost,127.0.0.1,.corp.internal

VPC endpoints / private endpoints

If your company uses a VPC endpoint for Bedrock:
export AWS_ENDPOINT_URL_BEDROCK_RUNTIME=https://bedrock-runtime.vpc.internal
export AWS_ENDPOINT_URL_BEDROCK=https://bedrock.vpc.internal
The SDK v2 reads these natively — no ChatCLI changes needed.

Environment Variables

| Variable | Description | Default |
| --- | --- | --- |
| BEDROCK_PROVIDER | Manual schema override: anthropic / claude, openai / gpt, converse / auto | auto-detect |
| BEDROCK_TEMPERATURE | Temperature (used by OpenAI and Converse paths) | |
| BEDROCK_TOP_P | Top-p sampling (used by the Converse path) | |
| BEDROCK_REGION | AWS region (takes precedence over AWS_REGION) | |
| AWS_REGION | AWS region (fallback) | |
| AWS_PROFILE | Profile in ~/.aws/credentials or ~/.aws/config (SSO, assume-role). Can be set in .env. | |
| AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY / AWS_SESSION_TOKEN | Static credentials | |
| AWS_CA_BUNDLE | PEM bundle read natively by SDK v2 | |
| AWS_ENDPOINT_URL_BEDROCK_RUNTIME | Override for Bedrock Runtime endpoint | |
| AWS_ENDPOINT_URL_BEDROCK | Override for Bedrock (control plane) endpoint | |
| AWS_EC2_METADATA_DISABLED | true explicitly disables IMDS (169.254.169.254) | |
| CHATCLI_BEDROCK_ENABLE_IMDS | 1/true forces IMDS probe on non-EC2 machines | false |
| BEDROCK_MAX_TOKENS | Output token limit | From catalog |
| ANTHROPIC_MAX_TOKENS | Alternative shared with direct Anthropic provider | |
| CHATCLI_BEDROCK_CA_BUNDLE | Bedrock-specific PEM bundle (overrides AWS_CA_BUNDLE) | |
| CHATCLI_BEDROCK_INSECURE_SKIP_VERIFY | true disables TLS verification (insecure) | false |
| HTTPS_PROXY / HTTP_PROXY / NO_PROXY | Standard Go/SDK HTTP proxy | |
Default model: global.anthropic.claude-sonnet-4-5-20250929-v1:0
Default region: us-east-1
All these vars surface in /config providers (chat) and /config quality (embeddings). See Environment Variables for the full reference.

Observability — endpoint URL in logs

Bedrock now logs its endpoint URL on every request — parity with Anthropic, OpenAI, and Copilot. Useful for debugging credential / region / VPC endpoint / proxy issues. On init (once per session):
INFO  llm.info.configuring_provider: Bedrock
      region=us-east-1
      endpoint=https://bedrock-runtime.us-east-1.amazonaws.com
      model=global.anthropic.claude-sonnet-4-5-20250929-v1:0
On each request (chat):
INFO  llm: request start  provider=BEDROCK
      family=anthropic
      region=us-east-1
      endpoint=https://bedrock-runtime.us-east-1.amazonaws.com
      payload_bytes=12453  history_len=8  max_tokens=4096
      cache_markers=3
On embeddings init:
INFO  bedrock embeddings: configured
      region=us-east-1
      endpoint=https://bedrock-runtime.us-east-1.amazonaws.com
      model=amazon.titan-embed-text-v2:0
      family=titan
      dim=1024
The URL is derived from the SDK-resolved region (https://bedrock-runtime.<region>.amazonaws.com). If you set AWS_ENDPOINT_URL_BEDROCK_RUNTIME (VPC endpoint), the SDK uses your override — the log shows the canonical URL but the actual request goes to your custom endpoint.
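A small Python sketch of how the logged URL relates to the region variables and the override. It is illustrative only, with hypothetical function names, and it ignores any region set inside the selected profile in ~/.aws/config, which the SDK would also consider.

```python
DEFAULT_REGION = "us-east-1"


def resolve_region(env):
    """BEDROCK_REGION wins over AWS_REGION; fall back to the default."""
    return env.get("BEDROCK_REGION") or env.get("AWS_REGION") or DEFAULT_REGION


def canonical_endpoint(region):
    """The URL that appears in the logs."""
    return f"https://bedrock-runtime.{region}.amazonaws.com"


def effective_endpoint(env):
    """Where requests actually go: the override if set, else the canonical URL."""
    override = env.get("AWS_ENDPOINT_URL_BEDROCK_RUNTIME")
    return override or canonical_endpoint(resolve_region(env))


if __name__ == "__main__":
    env = {
        "AWS_REGION": "eu-west-1",
        "AWS_ENDPOINT_URL_BEDROCK_RUNTIME": "https://bedrock-runtime.vpc.internal",
    }
    print("logged:", canonical_endpoint(resolve_region(env)))
    print("actual:", effective_endpoint(env))
```

When the two printed values differ, a VPC endpoint override is in effect and the logged URL should be read as the region indicator rather than the real destination.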

Architecture

The bedrockruntime.Client construction lives in an exported helper (bedrock.LoadBedrockRuntime) shared between the chat client and the embeddings provider — single source of truth for AWS config. Authentication is SigV4, handled transparently by the SDK. The HTTP client can be overridden by ChatCLI when CHATCLI_BEDROCK_CA_BUNDLE or CHATCLI_BEDROCK_INSECURE_SKIP_VERIFY is set (via awshttp.BuildableClient).

Bedrock vs. Direct Anthropic

| Aspect | BEDROCK | CLAUDEAI (direct Anthropic) |
| --- | --- | --- |
| Auth | AWS credentials chain (IAM, profile) | API key (sk-ant-...) or OAuth |
| Endpoint | bedrock-runtime.<region>.amazonaws.com | api.anthropic.com |
| Billing | AWS account (Billing console + CloudTrail) | Anthropic account (console.anthropic.com) |
| Models | Full Bedrock catalog (Claude, OpenAI, Llama, Nova, Mistral, Cohere, AI21, DeepSeek, Moonshot, MiniMax, Qwen, Z.AI, Gemma, Nemotron, TwelveLabs) | All Claude, latest versions first |
| Streaming | Not implemented in this version (uses InvokeModel / Converse) | Supported |
| OAuth/1M context | N/A | Supported (ANTHROPIC_1MTOKENS_SONNET) |
| Private VPC | Yes (via AWS_ENDPOINT_URL_*) | No |
| Compliance | Inherits from AWS (SOC2, HIPAA, etc.) | Inherits from Anthropic |
| Embeddings | Yes: Titan v1/v2 + Cohere v3 (same AWS chain) | Not available |
If your company already runs everything on AWS with managed compliance, BEDROCK is the way. If you’re an individual developer wanting the newest Claude features (1M context, OAuth via Claude Code plan), use CLAUDEAI direct.

Troubleshooting

ChatCLI shows the following message when you select a base ID that requires an inference profile (Claude 3.7+/4.x/4.5/4.6/4.7 and equivalents from other providers). The message already suggests the fix:
bedrock: model "anthropic.claude-3-7-sonnet-20250219-v1:0" requires an
inference profile (the bare foundation ID is not invokable on-demand).
Try selecting "global.anthropic.claude-3-7-sonnet-20250219-v1:0",
"us.anthropic.claude-3-7-sonnet-20250219-v1:0",
"eu.anthropic.claude-3-7-sonnet-20250219-v1:0", or
"apac.anthropic.claude-3-7-sonnet-20250219-v1:0" instead, or run
`/switch --model` to see profiles your account has access to.
/switch --model automatically filters out base IDs that require profiles, so this only shows up if you typed an ID manually. The filter uses the InferenceTypesSupported field of ListFoundationModels: a model without ON_DEMAND is suppressed from the listing.
If an invocation fails with AccessDeniedException, go to the Bedrock console for that region and enable Model Access for the provider. This takes a few minutes. Also check that the IAM role has bedrock:InvokeModel on both the model ARN and the inference profile ARN.
If the SDK cannot find credentials, check:
aws sts get-caller-identity   # should return your identity
env | grep -E 'AWS_|BEDROCK_'
ls -la ~/.aws/
If nothing returns credentials, set them up via aws configure, aws sso login, or export env vars.
An error that mentions the EC2 Instance Metadata Service (IMDS, 169.254.169.254) occurs when the AWS SDK tries to reach IMDS on a machine that is not EC2 (e.g., your laptop). ChatCLI disables the IMDS probe by default on non-EC2 machines, but if the error persists:
# Solution 1: Set valid credentials
export AWS_PROFILE=my-profile
# or
export AWS_ACCESS_KEY_ID=AKIA...

# Solution 2: Explicitly disable IMDS
export AWS_EC2_METADATA_DISABLED=true
If you are actually on EC2 and need IMDS:
export CHATCLI_BEDROCK_ENABLE_IMDS=1
If you see SSOTokenProviderError, your SSO token has expired (default validity ~8h). Log in again:
aws sso login --profile your-profile
Remember to have AWS_PROFILE set (env, .env, or name your profile default).
If you see x509: certificate signed by unknown authority, a corporate proxy is intercepting TLS. Configure CHATCLI_BEDROCK_CA_BUNDLE with the PEM of the corporate CA. For quick troubleshooting, set CHATCLI_BEDROCK_INSECURE_SKIP_VERIFY=true (insecure, temporary only).
If requests are being throttled, you have hit the on-demand quota for that region. Options:
  • Use a global.* inference profile (routes to any available region)
  • Use Provisioned Throughput (configure in the Bedrock console)
  • Raise quotas via AWS Service Quotas

Embeddings via Bedrock

ChatCLI also uses Bedrock as an embeddings provider (HyDE phase 3b, vector retrieval). Activation:
export CHATCLI_EMBED_PROVIDER=bedrock
export CHATCLI_QUALITY_HYDE_ENABLED=true
export CHATCLI_QUALITY_HYDE_USE_VECTORS=true

# Optional — defaults: Titan v2 1024-dim, same region as chat
export CHATCLI_EMBED_MODEL=amazon.titan-embed-text-v2:0
export CHATCLI_EMBED_DIMENSIONS=1024     # Titan v2: 256/512/1024
Supported families:
| Model ID prefix | Family | Dimensions |
| --- | --- | --- |
| amazon.titan-embed-text-v2* | Titan v2 | 256 / 512 / 1024 (configurable via CHATCLI_EMBED_DIMENSIONS) |
| amazon.titan-embed-text-v1 | Titan v1 | 1536 (fixed) |
| cohere.embed-english-v3 / cohere.embed-multilingual-v3 | Cohere v3 | 1024 (fixed) |
Reuses the same credentials chain as the chat client — BEDROCK_REGION / AWS_REGION / AWS_PROFILE / ~/.aws/credentials etc. See RAG + HyDE for the retrieval architecture.
Titan and Cohere use different schemas but ChatCLI auto-detects from the model id prefix. If you need a large batch with Titan (which only accepts 1 text per call), the provider parallelizes with an 8-worker pool transparently.
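The family detection and the one-text-per-call fan-out can be sketched as below. This is illustrative Python, not ChatCLI's Go implementation. The names embedding_family and embed_batch are hypothetical, and embed_one stands in for whatever function performs a single Bedrock call; only the prefixes and the pool size of 8 come from this page.

```python
from concurrent.futures import ThreadPoolExecutor


def embedding_family(model_id):
    """Hypothetical helper: map a model id prefix to its embeddings family."""
    if model_id.startswith("amazon.titan-embed-text-v2"):
        return "titan-v2"
    if model_id.startswith("amazon.titan-embed-text-v1"):
        return "titan-v1"
    if model_id.startswith(("cohere.embed-english-v3", "cohere.embed-multilingual-v3")):
        return "cohere-v3"
    raise ValueError(f"unsupported embeddings model: {model_id!r}")


def embed_batch(texts, embed_one, workers=8):
    """Embed texts one call each, in parallel, keeping the input order.

    embed_one is a stand-in for a single-text embedding call.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the order of the inputs,
        # regardless of which worker finishes first.
        return list(pool.map(embed_one, texts))


if __name__ == "__main__":
    # Stub embedder so the sketch runs without AWS access.
    fake_embed = lambda text: [float(len(text))]
    print(embedding_family("amazon.titan-embed-text-v2:0"))
    print(embed_batch(["a", "bb", "ccc"], fake_embed))
```

Preserving input order matters because retrieval code typically pairs each vector with its source text by position.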

Next Steps

Provider Fallback

Configure automatic failover between Bedrock and other providers

RAG + HyDE

Embeddings via Bedrock Titan/Cohere for semantic retrieval

Supported Models

Full list of models per provider

Environment Variables

Complete configuration reference