BEDROCK) with three dispatch paths that cover the entire AWS-hosted catalog:
- Anthropic Messages: anthropic.* and inference profiles (global./us./eu./apac.anthropic.*). Preserves cache markers and extended-thinking budget.
- OpenAI Chat Completions: openai.gpt-oss-* (OpenAI’s open-weights models on Bedrock).
- Converse API (default): AWS’s unified schema covering everything else: Llama, Amazon Nova, Mistral, Cohere, AI21 Jamba, DeepSeek, Stability, Writer Palmyra, Moonshot Kimi, MiniMax, Qwen, Z.AI/GLM, Google Gemma, NVIDIA Nemotron, TwelveLabs Pegasus, and any provider AWS onboards next.
The /switch --model listing relies entirely on the live AWS responses from ListFoundationModels and ListInferenceProfiles; there is no hardcoded allowlist. A new model on AWS shows up on the next /switch --model without a ChatCLI release.
Ideal for corporate environments that already manage billing, compliance, and access control through AWS — no need for API keys from the original providers.
Why AWS Bedrock?
No per-provider API key
~/.aws/credentials, AWS_PROFILE). Single identity across every model.
AWS billing and compliance
Full catalog
VPC endpoints
AWS_ENDPOINT_URL_BEDROCK_RUNTIME.
Auto-detected family
Native embeddings
Configuration
The provider is auto-detected when ChatCLI finds valid AWS credentials (not just file existence):
- Static creds in env: AWS_ACCESS_KEY_ID
- Profile selection: AWS_PROFILE (via env var or .env file)
- ~/.aws/credentials file with at least one non-empty aws_access_key_id
- AWS SSO: SSO profile in ~/.aws/config (detects sso_session, sso_start_url, sso_account_id)
- Assume-role / credential_process: profiles with role_arn or credential_process in ~/.aws/config
- SSO token cache: presence of files in ~/.aws/sso/cache/ (indicating a prior aws sso login)
- Web Identity Token (EKS IRSA): AWS_WEB_IDENTITY_TOKEN_FILE
- Container Credentials (ECS): AWS_CONTAINER_CREDENTIALS_RELATIVE_URI / _FULL_URI
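The "valid credentials, not just file existence" idea in the list above can be illustrated with a small sketch. This is not ChatCLI's implementation (which is not shown in this document); the function names env_signals and has_static_key are hypothetical, and only the environment-variable and credentials-file checks are covered.

```python
import configparser

# Environment variables the list above names as credential signals.
ENV_SIGNALS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_PROFILE",
    "AWS_WEB_IDENTITY_TOKEN_FILE",
    "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
    "AWS_CONTAINER_CREDENTIALS_FULL_URI",
)


def env_signals(env):
    """Return the listed variables that are set to a non-empty value."""
    return [name for name in ENV_SIGNALS if env.get(name, "").strip()]


def has_static_key(credentials_text):
    """True if any profile in INI-style credentials text has a non-empty key id.

    Hypothetical helper: a file that exists but holds only empty keys
    does not count as a credential source.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(credentials_text)
    return any(
        parser.get(section, "aws_access_key_id", fallback="").strip()
        for section in parser.sections()
    )


if __name__ == "__main__":
    sample = "[default]\naws_access_key_id = AKIAEXAMPLE\n"
    print(env_signals({"AWS_PROFILE": "dev"}))
    print(has_static_key(sample))
```

The SSO, assume-role, and token-cache checks from the list would follow the same pattern against ~/.aws/config and ~/.aws/sso/cache/.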
Option 1: ~/.aws/credentials (static credentials)
If you already use the AWS CLI, a configured profile is all you need.
Option 2: AWS SSO (IAM Identity Center)
If your company uses AWS SSO, configure the profile in ~/.aws/config:
~/.aws/config (via sso_session, sso_start_url, sso_account_id keys). If the SSO token expires, the error will be clear (SSOTokenProviderError); just run aws sso login again.
Important: the AWS SDK does not know which profile is “logged in”. You must indicate the profile via AWS_PROFILE (env, .env, or flag). If your SSO profile is named default, it is used automatically without AWS_PROFILE.
Option 3: Environment variables (static credentials)
Option 4: IAM Role (EC2/ECS/EKS)
On AWS-native environments, there is nothing to configure: the SDK picks up the role automatically through IMDSv2 / web identity. Just make sure the role has the IAM permissions below.
AWS_CONTAINER_CREDENTIALS_*, AWS_WEB_IDENTITY_TOKEN_FILE, ECS_CONTAINER_METADATA_URI*).
To force behavior, use:
- AWS_EC2_METADATA_DISABLED=true: explicitly disable IMDS
- CHATCLI_BEDROCK_ENABLE_IMDS=1: force enable IMDS (useful on EC2 without standard env vars)
IAM Permissions
Minimum permissions to invoke and list models. The bedrock:InvokeModel action covers both InvokeModel (Anthropic/OpenAI) and Converse (everything else).
Bedrock Console → Model access → Request access.
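As an illustration of the permissions described in this section, the sketch below builds an IAM policy document containing the invoke action plus the two listing actions this page mentions (ListFoundationModels and ListInferenceProfiles). It is an assumption about what a minimal policy could look like, not the project's published policy. The helper name and the Sid values are hypothetical, and the wildcard Resource is a placeholder you would normally narrow to specific model and inference-profile ARNs.

```python
import json


def minimal_bedrock_policy(invoke_resource="*"):
    """Build a policy document with the actions named in this section.

    `invoke_resource` is a placeholder; pass a list of model and
    inference-profile ARNs to scope invocation more tightly.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "InvokeModels",
                "Effect": "Allow",
                # Per the text, this single action covers InvokeModel and Converse.
                "Action": ["bedrock:InvokeModel"],
                "Resource": invoke_resource,
            },
            {
                "Sid": "ListModels",
                "Effect": "Allow",
                "Action": [
                    "bedrock:ListFoundationModels",
                    "bedrock:ListInferenceProfiles",
                ],
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(minimal_bedrock_policy(), indent=2))
```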
Model families and schema selection
Bedrock uses different schemas depending on the model. ChatCLI has three dispatch paths and auto-detects which one to use from the model-id prefix:
| Model id prefix | Family | Schema | Why |
|---|---|---|---|
| anthropic.*, global./us./eu./apac.anthropic.* | Anthropic Messages | anthropic_version, messages, system (with cache_control) | Preserves cache breakpoints, extended-thinking budget, every Claude knob |
| openai.*, us.openai.*, etc. | OpenAI Chat Completions | messages, max_completion_tokens | Stable schema with broad coverage for GPT-OSS |
| Anything else (Llama, Nova, Mistral, Cohere, AI21, DeepSeek, Stability, Writer, Moonshot, MiniMax, Qwen, Z.AI, Gemma, Nemotron, TwelveLabs, …) | Converse API (default) | messages, system, inferenceConfig | AWS-unified schema: one implementation covers every provider that doesn’t need special features |
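The prefix rules in the table can be sketched as a small function. This is an illustrative reading of the table, not ChatCLI's actual dispatch code. The function name detect_family and the list of geographic prefixes are assumptions taken from the prefixes this page mentions.

```python
# Geographic inference-profile prefixes mentioned on this page.
GEO_PREFIXES = ("global.", "us.", "eu.", "apac.")


def detect_family(model_id):
    """Map a model or inference-profile id to one of the three dispatch paths."""
    rest = model_id
    for geo in GEO_PREFIXES:
        if rest.startswith(geo):
            rest = rest[len(geo):]  # drop the profile prefix, keep the provider part
            break
    if rest.startswith("anthropic."):
        return "anthropic"
    if rest.startswith("openai."):
        return "openai"
    return "converse"  # default path for every other provider


if __name__ == "__main__":
    print(detect_family("global.anthropic.claude-sonnet-4-5-20250929-v1:0"))
```

Because Converse is the fallback, a provider that AWS adds later needs no new branch in a scheme like this.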
Manual override
To force a family regardless of the prefix (e.g. test Converse on an Anthropic model), use the BEDROCK_PROVIDER env var. Accepted values: anthropic / claude, openai / gpt, converse / auto (case-insensitive). The env var takes precedence over prefix detection.
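A minimal sketch of how such an override could be resolved, assuming the variable is BEDROCK_PROVIDER as listed in the Environment Variables table. The function name resolve_family is hypothetical. The value auto is deliberately left out of the alias map because the text does not spell out its behavior; in this sketch it, like any unrecognized value, falls through to the detected family, which is an assumption.

```python
import os

# Aliases as paired in the text; "auto" is intentionally omitted (see lead-in).
ALIASES = {
    "anthropic": "anthropic",
    "claude": "anthropic",
    "openai": "openai",
    "gpt": "openai",
    "converse": "converse",
}


def resolve_family(detected, env=None):
    """Return the forced family if BEDROCK_PROVIDER names one, else `detected`."""
    env = os.environ if env is None else env
    raw = env.get("BEDROCK_PROVIDER", "").strip().lower()  # case-insensitive match
    return ALIASES.get(raw, detected)


if __name__ == "__main__":
    # Force the Converse path for a model that would otherwise use Anthropic Messages.
    print(resolve_family("anthropic", {"BEDROCK_PROVIDER": "converse"}))
```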
/switch --model lists every text-output model with on-demand inference your account has access to (Kimi K2.6, GLM 4.7, Qwen3 Coder Next, Nemotron Nano 3, anything new AWS adds) without a release on our side. If a rare ID doesn’t fit Converse, ChatCLI returns a friendly error pointing the way.
Inference Profiles vs. Model IDs
This is the most important detail when using Claude on Bedrock. Modern Anthropic models (3.7, 4.x, 4.5, 4.6) do NOT accept direct on-demand invocation by base model ID. Attempting this returns an error; these models must be invoked through an inference profile, identified by one of the following prefixes:
| Prefix | Meaning |
|---|---|
| global.* | Global: newest tier, worldwide availability (recommended) |
| us.* | Cross-region USA (us-east-1, us-east-2, us-west-2) |
| eu.* | Cross-region Europe |
| apac.* | Cross-region Asia-Pacific |
global.anthropic.claude-sonnet-4-5-20250929-v1:0). Claude 3 and 3.5 models still accept direct base-ID invocation and are also in the catalog.
Model Listing
/switch --model queries two live sources and merges them with the static catalog:
- bedrock:ListFoundationModels with ByOutputModality: TEXT, which returns the text-output models available in the region.
- bedrock:ListInferenceProfiles, which returns the regional/global profiles (paginated).
Two filters apply to the foundation-model results:
- Modality TEXT (server-side): drops embedding-only and image-only models.
- InferenceTypesSupported contains ON_DEMAND: drops base IDs that are only invokable via inference profile (Claude 3.7+/4.x and cross-region-only IDs from other providers). Those models still appear via ListInferenceProfiles with a global./us./eu./apac. prefix.
[api] entries are the ones your account can actually invoke in that region. [catalog] entries are static registrations that may or may not be enabled.
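The two filters can be sketched over plain dictionaries shaped like the model summaries that ListFoundationModels returns (the modelId, outputModalities, and inferenceTypesSupported fields). This is an illustration under assumptions: the function name is hypothetical, the model ids in the usage example are made up, and the modality check is done client-side here only to keep the sketch self-contained, whereas the text describes it as a server-side filter.

```python
def invokable_text_models(model_summaries):
    """Keep ids of text-output models that support on-demand inference."""
    kept = []
    for summary in model_summaries:
        if "TEXT" not in summary.get("outputModalities", []):
            continue  # embedding-only or image-only model
        if "ON_DEMAND" not in summary.get("inferenceTypesSupported", []):
            continue  # base id reachable only through an inference profile
        kept.append(summary["modelId"])
    return kept


if __name__ == "__main__":
    # Hypothetical summaries, for illustration only.
    summaries = [
        {"modelId": "example.chat-v1", "outputModalities": ["TEXT"],
         "inferenceTypesSupported": ["ON_DEMAND"]},
        {"modelId": "example.profile-only-v1", "outputModalities": ["TEXT"],
         "inferenceTypesSupported": ["INFERENCE_PROFILE"]},
        {"modelId": "example.embed-v1", "outputModalities": ["EMBEDDING"],
         "inferenceTypesSupported": ["ON_DEMAND"]},
    ]
    print(invokable_text_models(summaries))
```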
Corporate Proxy and Private TLS
In corporate environments with a proxy intercepting TLS using a private CA, you may see an x509: certificate signed by unknown authority error. These variables address it:
| Variable | Description |
|---|---|
| CHATCLI_BEDROCK_CA_BUNDLE | Path to a PEM bundle with the corporate CA. Merged into the system pool and used as RootCAs. Takes precedence over AWS_CA_BUNDLE. |
| CHATCLI_BEDROCK_INSECURE_SKIP_VERIFY | true disables TLS verification entirely (equivalent to Node’s NODE_TLS_REJECT_UNAUTHORIZED=0). Insecure: use only to confirm a TLS issue. |
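The precedence and "merged into the system pool" behavior described in the table can be sketched as follows. This is a Python analogy, not ChatCLI's Go implementation; the function names are hypothetical, and only the two bundle variables from the table are considered.

```python
import os
import ssl


def pick_ca_bundle(env=None):
    """Return the bundle path to trust, preferring the Bedrock-specific variable."""
    env = os.environ if env is None else env
    for name in ("CHATCLI_BEDROCK_CA_BUNDLE", "AWS_CA_BUNDLE"):
        path = env.get(name, "").strip()
        if path:
            return path
    return None


def build_tls_context(ca_bundle=None):
    """Create a context that trusts the system CAs plus an optional extra bundle."""
    context = ssl.create_default_context()  # starts from the system trust store
    if ca_bundle:
        # Adds the corporate CA certificates on top of the system ones.
        context.load_verify_locations(cafile=ca_bundle)
    return context


if __name__ == "__main__":
    context = build_tls_context(pick_ca_bundle())
    print(context.verify_mode == ssl.CERT_REQUIRED)
```

Adding the corporate CA to the existing pool, rather than replacing the pool, keeps public endpoints verifiable while the intercepting proxy's certificate is also accepted.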
CHATCLI_CA_BUNDLE / CHATCLI_TLS_INSECURE_SKIP_VERIFY: they apply to every outbound connection (LLM providers, web tools, gateway, MCP), and Bedrock inherits them as fallback. The Bedrock-specific ones take precedence when both are set. See Global TLS Trust.
VPC endpoints / private endpoints
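If your company uses a VPC endpoint for Bedrock, point ChatCLI at it with AWS_ENDPOINT_URL_BEDROCK_RUNTIME (and AWS_ENDPOINT_URL_BEDROCK for the control plane).
Environment Variables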
| Variable | Description | Default |
|---|---|---|
| BEDROCK_PROVIDER | Manual schema override: anthropic / claude, openai / gpt, converse / auto | auto-detect |
| BEDROCK_TEMPERATURE | Temperature (used by OpenAI and Converse paths) | - |
| BEDROCK_TOP_P | Top-p sampling (used by the Converse path) | - |
| BEDROCK_REGION | AWS region (takes precedence over AWS_REGION) | - |
| AWS_REGION | AWS region (fallback) | - |
| AWS_PROFILE | Profile in ~/.aws/credentials or ~/.aws/config (SSO, assume-role). Can be set in .env. | - |
| AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY / AWS_SESSION_TOKEN | Static credentials | - |
| AWS_CA_BUNDLE | PEM bundle read natively by SDK v2 | - |
| AWS_ENDPOINT_URL_BEDROCK_RUNTIME | Override for Bedrock Runtime endpoint | - |
| AWS_ENDPOINT_URL_BEDROCK | Override for Bedrock (control plane) endpoint | - |
| AWS_EC2_METADATA_DISABLED | true explicitly disables IMDS (169.254.169.254) | - |
| CHATCLI_BEDROCK_ENABLE_IMDS | 1/true forces IMDS probe on non-EC2 machines | false |
| BEDROCK_MAX_TOKENS | Output token limit | From catalog |
| ANTHROPIC_MAX_TOKENS | Alternative shared with direct Anthropic provider | - |
| CHATCLI_BEDROCK_CA_BUNDLE | Bedrock-specific PEM bundle (overrides AWS_CA_BUNDLE) | - |
| CHATCLI_BEDROCK_INSECURE_SKIP_VERIFY | true disables TLS verification (insecure) | false |
| HTTPS_PROXY / HTTP_PROXY / NO_PROXY | Standard Go/SDK HTTP proxy | - |
global.anthropic.claude-sonnet-4-5-20250929-v1:0
Default region: us-east-1
All these vars surface in /config providers (chat) and /config quality (embeddings). See Environment Variables for the full reference.
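The region and endpoint rows above imply a simple resolution order, sketched below: BEDROCK_REGION, then AWS_REGION, then the default us-east-1, with AWS_ENDPOINT_URL_BEDROCK_RUNTIME replacing the canonical runtime URL when set. The function names are hypothetical, and the sketch ignores any region configured inside an AWS profile, which the SDK may also consult.

```python
import os

DEFAULT_REGION = "us-east-1"


def resolve_region(env=None):
    """BEDROCK_REGION wins over AWS_REGION; fall back to the default region."""
    env = os.environ if env is None else env
    for name in ("BEDROCK_REGION", "AWS_REGION"):
        value = env.get(name, "").strip()
        if value:
            return value
    return DEFAULT_REGION


def runtime_endpoint(env=None):
    """Return the endpoint override if set, else the canonical runtime URL."""
    env = os.environ if env is None else env
    override = env.get("AWS_ENDPOINT_URL_BEDROCK_RUNTIME", "").strip()
    if override:
        return override
    return f"https://bedrock-runtime.{resolve_region(env)}.amazonaws.com"


if __name__ == "__main__":
    print(runtime_endpoint({"AWS_REGION": "us-west-2"}))
```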
Observability — endpoint URL in logs
Bedrock now logs its endpoint URL on every request, for parity with Anthropic, OpenAI, and Copilot. This is useful for debugging credential / region / VPC endpoint / proxy issues. On init (once per session):
https://bedrock-runtime.<region>.amazonaws.com). If you set AWS_ENDPOINT_URL_BEDROCK_RUNTIME (VPC endpoint), the SDK uses your override: the log shows the canonical URL but the actual request goes to your custom endpoint.
Architecture
The bedrockruntime.Client construction lives in an exported helper (bedrock.LoadBedrockRuntime) shared between the chat client and the embeddings provider, giving a single source of truth for AWS config. Authentication is SigV4, handled transparently by the SDK. The HTTP client can be overridden by ChatCLI when CHATCLI_BEDROCK_CA_BUNDLE or CHATCLI_BEDROCK_INSECURE_SKIP_VERIFY is set (via awshttp.BuildableClient).
Bedrock vs. Direct Anthropic
| Aspect | BEDROCK | CLAUDEAI (direct Anthropic) |
|---|---|---|
| Auth | AWS credentials chain (IAM, profile) | API key (sk-ant-...) or OAuth |
| Endpoint | bedrock-runtime.<region>.amazonaws.com | api.anthropic.com |
| Billing | AWS account (Billing console + CloudTrail) | Anthropic account (console.anthropic.com) |
| Models | Full Bedrock catalog (Claude, OpenAI, Llama, Nova, Mistral, Cohere, AI21, DeepSeek, Moonshot, MiniMax, Qwen, Z.AI, Gemma, Nemotron, TwelveLabs) | All Claude, latest versions first |
| Streaming | Not implemented in this version (uses InvokeModel / Converse) | Supported |
| OAuth/1M context | N/A | Supported (ANTHROPIC_1MTOKENS_SONNET) |
| Private VPC | Yes (via AWS_ENDPOINT_URL_*) | No |
| Compliance | Inherits from AWS (SOC2, HIPAA, etc.) | Inherits from Anthropic |
| Embeddings | Yes — Titan v1/v2 + Cohere v3 (same AWS chain) | Not available |
Troubleshooting
bedrock: model X requires an inference profile
/switch --model automatically filters out base IDs that require profiles, so this only shows up if you typed an ID manually. The filter uses the InferenceTypesSupported field of ListFoundationModels: a model without ON_DEMAND is suppressed from the listing.
AccessDeniedException: You don't have access to the model
bedrock:InvokeModel on the model ARN + the inference profile ARN.
NoCredentialProviders / unable to load SDK config
aws configure, aws sso login, or export env vars.
no EC2 IMDS role found / dial tcp 169.254.169.254:80: connect: host is down
SSOTokenProviderError / expired token (SSO)
AWS_PROFILE set (env, .env, or name your profile default).
x509: certificate signed by unknown authority
ThrottlingException / ServiceQuotaExceededException
- Use a global.* inference profile (routes to any available region)
- Use Provisioned Throughput (configure in the Bedrock console)
- Raise quotas via AWS Service Quotas
Embeddings via Bedrock
ChatCLI also uses Bedrock as an embeddings provider (HyDE phase 3b, vector retrieval). Activation:
| Model ID prefix | Family | Dimensions |
|---|---|---|
| amazon.titan-embed-text-v2* | Titan v2 | 256 / 512 / 1024 (configurable via CHATCLI_EMBED_DIMENSIONS) |
| amazon.titan-embed-text-v1 | Titan v1 | 1536 (fixed) |
| cohere.embed-english-v3 / cohere.embed-multilingual-v3 | Cohere v3 | 1024 (fixed) |
BEDROCK_REGION / AWS_REGION / AWS_PROFILE / ~/.aws/credentials etc. See RAG + HyDE for the retrieval architecture.
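The dimension rules in the table above can be expressed as a small lookup. This is a sketch under assumptions: the function name is hypothetical, the default of 1024 for Titan v2 when nothing is requested is an assumption (this page does not state ChatCLI's default), and a requested size is simply ignored for the fixed-size families.

```python
TITAN_V2_DIMENSIONS = (256, 512, 1024)


def embedding_dimensions(model_id, requested=None):
    """Return the vector size for a Bedrock embedding model id.

    `requested` stands in for a value read from CHATCLI_EMBED_DIMENSIONS.
    """
    if model_id.startswith("amazon.titan-embed-text-v2"):
        if requested is None:
            return 1024  # assumed default, not confirmed by this page
        if requested not in TITAN_V2_DIMENSIONS:
            raise ValueError(f"Titan v2 supports {TITAN_V2_DIMENSIONS}, got {requested}")
        return requested
    if model_id.startswith("amazon.titan-embed-text-v1"):
        return 1536  # fixed
    if model_id.startswith(("cohere.embed-english-v3", "cohere.embed-multilingual-v3")):
        return 1024  # fixed
    raise ValueError(f"unknown embedding model: {model_id}")


if __name__ == "__main__":
    print(embedding_dimensions("amazon.titan-embed-text-v2:0", 512))
```

Validating the size up front matters because a vector index is typically created for one fixed dimension, so a mismatch would otherwise surface only at query time.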