diff --git a/docs/CONFIGURATION.md b/docs/CONFIGURATION.md
index 89714c23f..35ca775f3 100644
--- a/docs/CONFIGURATION.md
+++ b/docs/CONFIGURATION.md
@@ -155,14 +155,17 @@ file, OS keyring backend, environment variable, winning source, and last-four label without printing the key itself. The command only probes the active provider's keyring entry.
-For hosted, generic OpenAI-compatible, or self-hosted providers, set
-`provider = "nvidia-nim"`, `"openai"`, `"atlascloud"`, `"wanjie-ark"`,
-`"volcengine"`, `"openrouter"`, `"xiaomi-mimo"`, `"novita"`, `"fireworks"`,
-`"siliconflow"`, `"siliconflow-CN"`, `"arcee"`, `"moonshot"`, `"sglang"`,
-`"vllm"`, or `"ollama"` or pass
-`codewhale --provider `.
-For the provider-by-provider registry, including auth variables, default base
-URLs, model IDs, and capability metadata, see [PROVIDERS.md](PROVIDERS.md).
+For hosted, generic OpenAI-compatible, self-hosted, OpenAI Responses, or native
+Anthropic providers, set `provider = ""` or pass
+`codewhale --provider `. The canonical provider IDs are `deepseek`,
+`nvidia-nim`, `openai`, `atlascloud`, `wanjie-ark`, `volcengine`,
+`openrouter`, `xiaomi-mimo`, `novita`, `fireworks`, `siliconflow`, `arcee`,
+`siliconflow-CN`, `moonshot`, `sglang`, `vllm`, `ollama`, `huggingface`,
+`together`, `qianfan`, `openai-codex`, `anthropic`, `zai`, `stepfun`,
+`minimax`, and `deepinfra`.
+For the provider-by-provider registry, including wire protocol, auth variables,
+default base URLs, model IDs, and capability metadata, see
+[PROVIDERS.md](PROVIDERS.md).
 The facade saves provider credentials to the shared user config and forwards the resolved key, base URL, provider, and model to the TUI process. Use `codewhale auth set --provider nvidia-nim --api-key "YOUR_NVIDIA_API_KEY"` or
@@ -172,8 +175,9 @@ the resolved key, base URL, provider, and model to the TUI process.
Use `codewhale auth set --provider xiaomi-mimo --api-key "YOUR_XIAOMI_KEY"` or `codewhale auth set --provider fireworks --api-key "YOUR_FIREWORKS_API_KEY"` or `codewhale auth set --provider siliconflow --api-key "YOUR_SILICONFLOW_API_KEY"` or
-`codewhale auth set --provider arcee --api-key "YOUR_ARCEE_API_KEY"`
-to save provider keys through the facade. The generic `openai` provider defaults
+`codewhale auth set --provider arcee --api-key "YOUR_ARCEE_API_KEY"` or the
+matching provider ID from [PROVIDERS.md](PROVIDERS.md) to save provider keys
+through the facade. The generic `openai` provider defaults
 to `https://api.openai.com/v1`, accepts `OPENAI_BASE_URL`, and defaults to `deepseek-v4-pro` for OpenAI-compatible gateways. `atlascloud` defaults to `https://api.atlascloud.ai/v1`, accepts `ATLASCLOUD_BASE_URL`, and uses
@@ -191,8 +195,8 @@ when a local server does require bearer auth. SiliconFlow defaults to `https://api.siliconflow.com/v1`, accepts `SILICONFLOW_BASE_URL`, and uses `deepseek-ai/DeepSeek-V4-Pro` by default. `provider = "siliconflow-CN"` selects the China regional default
-`https://api.siliconflow.cn/v1` while sharing the same
-`[providers.siliconflow]` table and `SILICONFLOW_API_KEY` credential slot.
+`https://api.siliconflow.cn/v1` with the `[providers.siliconflow_cn]` table and
+`SILICONFLOW_API_KEY` credential slot.
 Arcee AI defaults to `https://api.arcee.ai/api/v1`, accepts `ARCEE_BASE_URL`, and uses `trinity-large-thinking` by default for CodeWhale agent work. `trinity-large-preview` is also listed as a direct Arcee API model; OpenRouter's
@@ -237,6 +241,9 @@ model = "qwen-plus" Use the regional DashScope `compatible-mode/v1` base URL that matches the region of your API key. CodeWhale keeps `qwen-plus` scoped to the `openai` provider route and does not infer a different provider from the model prefix.
+The same rule applies to all provider-prefixed model strings: a prefix such as
+`deepseek-ai/...` or `deepseek/...` is a provider-owned wire ID under the
+selected provider, not an automatic switch to the DeepSeek provider.
 If the gateway accepts `POST /chat/completions` but rejects `/v1/chat/completions`, set a provider-local `path_suffix`:
@@ -446,7 +453,7 @@ aliases. When both forms are set the `CODEWHALE_*` value wins; the `DEEPSEEK_*` form is kept for older shells:
 - `CODEWHALE_PROVIDER` (preferred) / `DEEPSEEK_PROVIDER` (legacy alias) —
- `deepseek|nvidia-nim|openai|atlascloud|wanjie-ark|volcengine|openrouter|xiaomi-mimo|novita|fireworks|siliconflow|siliconflow-CN|arcee|moonshot|sglang|vllm|ollama`
+ `deepseek|nvidia-nim|openai|atlascloud|wanjie-ark|volcengine|openrouter|xiaomi-mimo|novita|fireworks|siliconflow|arcee|siliconflow-CN|moonshot|sglang|vllm|ollama|huggingface|together|qianfan|openai-codex|anthropic|zai|stepfun|minimax|deepinfra`
 - `CODEWHALE_MODEL` (preferred) / `DEEPSEEK_MODEL` (legacy alias) — default model for the active provider
 - `CODEWHALE_BASE_URL` (preferred) / `DEEPSEEK_BASE_URL` (legacy alias) — base URL for the active provider
@@ -474,14 +481,17 @@ Remaining variables:
 - `VOLCENGINE_MODEL` or `VOLCENGINE_ARK_MODEL`
 - `OPENROUTER_API_KEY`
 - `OPENROUTER_BASE_URL`
+- `OPENROUTER_MODEL`
 - `XIAOMI_MIMO_TOKEN_PLAN_API_KEY`, `MIMO_TOKEN_PLAN_API_KEY`, `XIAOMI_MIMO_API_KEY`, `XIAOMI_API_KEY`, or `MIMO_API_KEY`
 - `XIAOMI_MIMO_BASE_URL` or `MIMO_BASE_URL`
 - `XIAOMI_MIMO_MODEL` or `MIMO_MODEL`
 - `XIAOMI_MIMO_MODE` or `MIMO_MODE`
 - `NOVITA_API_KEY`
 - `NOVITA_BASE_URL`
+- `NOVITA_MODEL`
 - `FIREWORKS_API_KEY`
 - `FIREWORKS_BASE_URL`
+- `FIREWORKS_MODEL`
 - `HUGGINGFACE_API_KEY` or `HF_TOKEN` (`HF_TOKEN` is a fallback alias accepted when provider is `huggingface`)
 - `HUGGINGFACE_BASE_URL` or `HF_BASE_URL`
 - `HUGGINGFACE_MODEL` or `HF_MODEL`
@@ -491,9 +501,31 @@ Remaining variables:
 - `ARCEE_API_KEY`
 - `ARCEE_BASE_URL`
 - `ARCEE_MODEL`
+- `TOGETHER_API_KEY`
+- `TOGETHER_BASE_URL`
+- `TOGETHER_MODEL`
+- `QIANFAN_API_KEY` or `BAIDU_QIANFAN_API_KEY`
+- `QIANFAN_BASE_URL` or `BAIDU_QIANFAN_BASE_URL`
+- `QIANFAN_MODEL` or `BAIDU_QIANFAN_MODEL`
+- `OPENAI_CODEX_ACCESS_TOKEN` or `CODEX_ACCESS_TOKEN`
+- `OPENAI_CODEX_BASE_URL` or `CODEX_BASE_URL`
+- `OPENAI_CODEX_MODEL` or `CODEX_MODEL`
+- `OPENAI_CODEX_ACCOUNT_ID` or `CODEX_ACCOUNT_ID`
 - `ANTHROPIC_API_KEY`
 - `ANTHROPIC_BASE_URL`
 - `ANTHROPIC_MODEL`
+- `ZAI_API_KEY` or `Z_AI_API_KEY`
+- `ZAI_BASE_URL` or `Z_AI_BASE_URL`
+- `ZAI_MODEL` or `Z_AI_MODEL`
+- `STEPFUN_API_KEY` or `STEP_API_KEY`
+- `STEPFUN_BASE_URL` or `STEP_BASE_URL`
+- `STEPFUN_MODEL` or `STEP_MODEL`
+- `MINIMAX_API_KEY`
+- `MINIMAX_BASE_URL`
+- `MINIMAX_MODEL`
+- `DEEPINFRA_API_KEY` or `DEEPINFRA_TOKEN`
+- `DEEPINFRA_BASE_URL`
+- `DEEPINFRA_MODEL`
 - `MOONSHOT_API_KEY` or `KIMI_API_KEY`
 - `MOONSHOT_BASE_URL` or `KIMI_BASE_URL`
 - `MOONSHOT_MODEL`, `KIMI_MODEL_NAME`, or `KIMI_MODEL`
@@ -982,14 +1014,14 @@ If you are upgrading from older releases:
 ### Core keys (used by the TUI/engine)
-- `provider` (string, optional): `deepseek` (default), `nvidia-nim`, `openai`, `atlascloud`, `wanjie-ark`, `volcengine`, `openrouter`, `xiaomi-mimo`, `novita`, `fireworks`, `siliconflow`, `siliconflow-CN`, `arcee`, `moonshot`, `minimax`, `zai`, `stepfun`, `deepinfra`, `huggingface`, `together`, `qianfan`, `openai-codex`, `anthropic`, `sglang`, `vllm`, or `ollama`. Legacy `deepseek-cn` configs are still accepted as an alias for `deepseek`; DeepSeek uses the same official host [`https://api.deepseek.com`](https://api-docs.deepseek.com/) worldwide.
`nvidia-nim` targets NVIDIA's NIM-hosted DeepSeek endpoints through `https://integrate.api.nvidia.com/v1`; `openai` targets a generic OpenAI-compatible endpoint, defaulting to `https://api.openai.com/v1`; `atlascloud` targets AtlasCloud's OpenAI-compatible endpoint at `https://api.atlascloud.ai/v1`; `wanjie-ark` targets Wanjie Ark's OpenAI-compatible endpoint at `https://maas-openapi.wanjiedata.com/api/v1`; `volcengine` targets Volcengine Ark's OpenAI-compatible coding endpoint at `https://ark.cn-beijing.volces.com/api/coding/v3`; `openrouter` targets `https://openrouter.ai/api/v1`; `xiaomi-mimo` targets Xiaomi MiMo's OpenAI-compatible endpoint, using `https://token-plan-sgp.xiaomimimo.com/v1` by default for Token Plan keys (`tp-...`) and `https://api.xiaomimimo.com/v1` for pay-as-you-go keys; set `base_url` explicitly if your Token Plan account uses the China region; `novita` targets `https://api.novita.ai/openai/v1`; `fireworks` targets `https://api.fireworks.ai/inference/v1`; `siliconflow` targets SiliconFlow, defaulting to `https://api.siliconflow.com/v1`; `siliconflow-CN` targets the SiliconFlow China regional endpoint while sharing `[providers.siliconflow]`; `arcee` targets Arcee AI's OpenAI-compatible endpoint at `https://api.arcee.ai/api/v1`; `moonshot` targets Moonshot/Kimi, defaulting to `https://api.moonshot.ai/v1`; `minimax` targets MiniMax at `https://api.minimax.io/v1`; `zai` targets Z.ai at `https://api.z.ai/api/coding/paas/v4`; `stepfun` targets StepFun at `https://api.stepfun.ai/v1`; `deepinfra` targets DeepInfra at `https://api.deepinfra.com/v1/openai`; `huggingface` targets Hugging Face Inference Providers at `https://router.huggingface.co/v1`; `together` targets Together AI at `https://api.together.xyz/v1`; `qianfan` targets Baidu Qianfan at `https://api.baiduqianfan.ai/v1`; `openai-codex` targets ChatGPT/Codex OAuth; `anthropic` targets Claude's native Messages API; `sglang` targets a self-hosted OpenAI-compatible endpoint, defaulting to 
`http://localhost:30000/v1`; `vllm` targets a self-hosted vLLM OpenAI-compatible endpoint, defaulting to `http://localhost:8000/v1`; `ollama` targets Ollama's OpenAI-compatible endpoint, defaulting to `http://localhost:11434/v1`. +- `provider` (string, optional): `deepseek` (default), `nvidia-nim`, `openai`, `atlascloud`, `wanjie-ark`, `volcengine`, `openrouter`, `xiaomi-mimo`, `novita`, `fireworks`, `siliconflow`, `arcee`, `siliconflow-CN`, `moonshot`, `sglang`, `vllm`, `ollama`, `huggingface`, `together`, `qianfan`, `openai-codex`, `anthropic`, `zai`, `stepfun`, `minimax`, or `deepinfra`. Legacy `deepseek-cn` configs are still accepted as an alias for `deepseek`; DeepSeek uses the same official host [`https://api.deepseek.com`](https://api-docs.deepseek.com/) worldwide. `nvidia-nim` targets NVIDIA's NIM-hosted DeepSeek endpoints through `https://integrate.api.nvidia.com/v1`; `openai` targets a generic OpenAI-compatible endpoint, defaulting to `https://api.openai.com/v1`; `atlascloud` targets AtlasCloud's OpenAI-compatible endpoint at `https://api.atlascloud.ai/v1`; `wanjie-ark` targets Wanjie Ark's OpenAI-compatible endpoint at `https://maas-openapi.wanjiedata.com/api/v1`; `volcengine` targets Volcengine Ark's OpenAI-compatible coding endpoint at `https://ark.cn-beijing.volces.com/api/coding/v3`; `openrouter` targets `https://openrouter.ai/api/v1`; `xiaomi-mimo` targets Xiaomi MiMo's OpenAI-compatible endpoint, using `https://token-plan-sgp.xiaomimimo.com/v1` by default for Token Plan keys (`tp-...`) and `https://api.xiaomimimo.com/v1` for pay-as-you-go keys; set `base_url` explicitly if your Token Plan account uses the China region; `novita` targets `https://api.novita.ai/openai/v1`; `fireworks` targets `https://api.fireworks.ai/inference/v1`; `siliconflow` targets SiliconFlow, defaulting to `https://api.siliconflow.com/v1`; `arcee` targets Arcee AI's OpenAI-compatible endpoint at `https://api.arcee.ai/api/v1`; `siliconflow-CN` targets the SiliconFlow China 
regional endpoint through `[providers.siliconflow_cn]`; `moonshot` targets Moonshot/Kimi, defaulting to `https://api.moonshot.ai/v1`; `sglang` targets a self-hosted OpenAI-compatible endpoint, defaulting to `http://localhost:30000/v1`; `vllm` targets a self-hosted vLLM OpenAI-compatible endpoint, defaulting to `http://localhost:8000/v1`; `ollama` targets Ollama's OpenAI-compatible endpoint, defaulting to `http://localhost:11434/v1`; `huggingface` targets Hugging Face Inference Providers at `https://router.huggingface.co/v1`; `together` targets Together AI at `https://api.together.xyz/v1`; `qianfan` targets Baidu Qianfan at `https://api.baiduqianfan.ai/v1`; `openai-codex` targets ChatGPT/Codex OAuth; `anthropic` targets Claude's native Messages API; `zai` targets Z.ai at `https://api.z.ai/api/coding/paas/v4`; `stepfun` targets StepFun at `https://api.stepfun.ai/v1`; `minimax` targets MiniMax at `https://api.minimax.io/v1`; `deepinfra` targets DeepInfra at `https://api.deepinfra.com/v1/openai`. - `api_key` (string, required for hosted providers): must be non-empty for DeepSeek/hosted providers (or set the provider API key env var). Self-hosted SGLang, vLLM, and Ollama can omit it. - `base_url` (string, optional): defaults to `https://api.deepseek.com/beta` for DeepSeek's OpenAI-compatible Chat Completions API, including legacy `provider = "deepseek-cn"` configs. 
Other defaults are `https://integrate.api.nvidia.com/v1` for `nvidia-nim`, `https://api.openai.com/v1` for `openai`, `https://api.atlascloud.ai/v1` for `atlascloud`, `https://maas-openapi.wanjiedata.com/api/v1` for `wanjie-ark`, `https://ark.cn-beijing.volces.com/api/coding/v3` for `volcengine`, `https://openrouter.ai/api/v1` for `openrouter`, `https://token-plan-sgp.xiaomimimo.com/v1` for `xiaomi-mimo` when the API key starts with `tp-...` and `https://api.xiaomimimo.com/v1` otherwise, `https://api.novita.ai/openai/v1` for `novita`, `https://api.fireworks.ai/inference/v1` for `fireworks`, `https://api.siliconflow.com/v1` for `siliconflow`, `https://api.siliconflow.cn/v1` for `siliconflow-CN`, `https://api.arcee.ai/api/v1` for `arcee`, `https://api.moonshot.ai/v1` for `moonshot`, `https://api.minimax.io/v1` for `minimax`, `https://api.z.ai/api/coding/paas/v4` for `zai`, `https://api.stepfun.ai/v1` for `stepfun`, `https://api.deepinfra.com/v1/openai` for `deepinfra`, `https://router.huggingface.co/v1` for `huggingface`, `https://api.together.xyz/v1` for `together`, `https://api.baiduqianfan.ai/v1` for `qianfan`, `https://chatgpt.com/backend-api` for `openai-codex`, `https://api.anthropic.com` for `anthropic`, `http://localhost:30000/v1` for `sglang`, `http://localhost:8000/v1` for `vllm`, and `http://localhost:11434/v1` for `ollama`. Set `base_url = "https://token-plan-cn.xiaomimimo.com/v1"` explicitly if your Xiaomi MiMo Token Plan account is provisioned in the China region. Set `https://api.deepseek.com` or `https://api.deepseek.com/v1` explicitly to opt out of DeepSeek beta features. - `path_suffix` (string, optional provider-table key): override the chat-completions path for OpenAI-compatible gateways that do not serve `/v1/chat/completions`. For example, `[providers.openai] path_suffix = "/chat/completions"` sends chat requests to the unversioned base URL plus `/chat/completions`; `models` and `beta/*` requests keep their normal routing. 
 - `reasoning_stream_style` (string, optional provider-table key): override how streaming reasoning is separated from answer text for the active provider route. Use `separate_field` for `reasoning_content` / `reasoning` deltas, `inline_tags` for gateways that stream `...` inside `delta.content`, or `none` to render incoming content exactly as answer text.
 - `[providers..auth]` (table, optional): provider-scoped auth source metadata. `source = "command"` stores a command argv plus optional `timeout_ms`; `source = "secret"` stores a `secret_id`. This slice lets provider readiness, `/provider`, and doctor JSON report the auth source class without exposing command argv output or secret values; executing commands and resolving external secret material is handled by the follow-up resolver work.
 - `insecure_skip_tls_verify` (bool, optional provider-table key): legacy compatibility key, disabled by default. When true on the active provider table, provider clients reject the configuration instead of skipping TLS certificate verification. Use `SSL_CERT_FILE` for corporate or private CA bundles; `codewhale doctor` reports stale uses of this setting.
-- `default_text_model` (string, optional): defaults to `deepseek-v4-pro` for DeepSeek and generic OpenAI-compatible endpoints, `deepseek-ai/deepseek-v4-pro` for NVIDIA NIM, `deepseek-ai/deepseek-v4-flash` for AtlasCloud, `deepseek-reasoner` for Wanjie Ark, `DeepSeek-V4-Pro` for Volcengine Ark, `deepseek/deepseek-v4-pro` for OpenRouter and Novita, `mimo-v2.5-pro` for Xiaomi MiMo, `accounts/fireworks/models/deepseek-v4-pro` for Fireworks, `deepseek-ai/DeepSeek-V4-Pro` for SiliconFlow and DeepInfra, `trinity-large-thinking` for Arcee AI, `kimi-k2.7-code` for Moonshot, `MiniMax-M3` for MiniMax, `GLM-5.2` for Z.ai, `step-3.7-flash` for StepFun, `ernie-4.0-turbo-8k` for Qianfan, `deepseek-ai/DeepSeek-V4-Pro` for SGLang/vLLM, and `deepseek-coder:1.3b` for Ollama. Hugging Face and Together AI both default to `deepseek-ai/DeepSeek-V4-Pro`.
Current public DeepSeek IDs are `deepseek-v4-pro` and `deepseek-v4-flash`, both with 1M context windows, 384K max output, and thinking mode enabled by default. Legacy `deepseek-chat` and `deepseek-reasoner` remain compatibility aliases for `deepseek-v4-flash` until July 24, 2026, except SiliconFlow maps `deepseek-reasoner` and `deepseek-r1` to its Pro model while `deepseek-chat` and `deepseek-v3` map to Flash. Provider-specific mappings translate `deepseek-v4-pro` / `deepseek-v4-flash` to each provider's model ID where supported. OpenRouter also recognizes recent large IDs such as `arcee-ai/trinity-large-thinking`, `minimax/minimax-m3`, `minimax/minimax-2.7`, `xiaomi/mimo-v2.5-pro`, `qwen/qwen3.6-flash`, `qwen/qwen3.6-35b-a3b`, `qwen/qwen3.6-max-preview`, `qwen/qwen3.6-27b`, `qwen/qwen3.6-plus`, `qwen/qwen3.7-max`, `google/gemma-4-31b-it`, `moonshotai/kimi-k2.7-code`, `moonshotai/kimi-k2.6`, `nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free`, and `nvidia/nemotron-3-ultra-550b-a55b`; direct Arcee uses bare IDs such as `trinity-large-thinking` and `trinity-large-preview`; direct Moonshot recognizes `kimi-k2.7-code`, `kimi-k2.6`, and Kimi Code's stable `kimi-for-coding`; direct MiniMax recognizes `MiniMax-M3` and the documented M2.x chat model IDs; direct Xiaomi MiMo recognizes chat IDs `mimo-v2.5-pro` and `mimo-v2.5`, while TTS IDs are selected through `codewhale speech` / `tts`. Generic `openai`, `atlascloud`, `wanjie-ark`, `xiaomi-mimo`, `arcee`, `moonshot`, `minimax`, `zai`, `stepfun`, `qianfan`, and Ollama model IDs are passed through unchanged after known aliases are normalized. OpenRouter and SiliconFlow provider configs with a custom `base_url` also preserve explicit model values, which lets OpenAI-compatible gateways accept bare model IDs. Use `/models` or `codewhale models` to discover live IDs from your configured endpoint. `CODEWHALE_MODEL` overrides this for a single process; `DEEPSEEK_MODEL` is the legacy alias. 
+- `default_text_model` (string, optional): defaults to `deepseek-v4-pro` for DeepSeek and generic OpenAI-compatible endpoints, `deepseek-ai/deepseek-v4-pro` for NVIDIA NIM, `deepseek-ai/deepseek-v4-flash` for AtlasCloud, `deepseek-reasoner` for Wanjie Ark, `DeepSeek-V4-Pro` for Volcengine Ark, `deepseek/deepseek-v4-pro` for OpenRouter and Novita, `mimo-v2.5-pro` for Xiaomi MiMo, `accounts/fireworks/models/deepseek-v4-pro` for Fireworks, `deepseek-ai/DeepSeek-V4-Pro` for SiliconFlow and DeepInfra, `trinity-large-thinking` for Arcee AI, `kimi-k2.7-code` for Moonshot, `MiniMax-M3` for MiniMax, `GLM-5.2` for Z.ai, `step-3.7-flash` for StepFun, `ernie-4.0-turbo-8k` for Qianfan, `deepseek-ai/DeepSeek-V4-Pro` for SGLang/vLLM, and `deepseek-coder:1.3b` for Ollama. Hugging Face and Together AI both default to `deepseek-ai/DeepSeek-V4-Pro`; `openai-codex` defaults to `gpt-5.5`; `anthropic` defaults to `claude-sonnet-4-6`. Current public DeepSeek IDs are `deepseek-v4-pro` and `deepseek-v4-flash`, both with 1M context windows, 384K max output, and thinking mode enabled by default. Legacy `deepseek-chat` and `deepseek-reasoner` remain compatibility aliases for `deepseek-v4-flash` until July 24, 2026, except SiliconFlow maps `deepseek-reasoner` and `deepseek-r1` to its Pro model while `deepseek-chat` and `deepseek-v3` map to Flash. Provider-specific mappings translate `deepseek-v4-pro` / `deepseek-v4-flash` to each provider's model ID where supported. 
OpenRouter also recognizes recent large IDs such as `arcee-ai/trinity-large-thinking`, `minimax/minimax-m3`, `minimax/minimax-2.7`, `xiaomi/mimo-v2.5-pro`, `qwen/qwen3.6-flash`, `qwen/qwen3.6-35b-a3b`, `qwen/qwen3.6-max-preview`, `qwen/qwen3.6-27b`, `qwen/qwen3.6-plus`, `qwen/qwen3.7-max`, `google/gemma-4-31b-it`, `moonshotai/kimi-k2.7-code`, `moonshotai/kimi-k2.6`, `nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free`, and `nvidia/nemotron-3-ultra-550b-a55b`; direct Arcee uses bare IDs such as `trinity-large-thinking` and `trinity-large-preview`; direct Moonshot recognizes `kimi-k2.7-code`, `kimi-k2.6`, and Kimi Code's stable `kimi-for-coding`; direct MiniMax recognizes `MiniMax-M3` and the documented M2.x chat model IDs; direct Xiaomi MiMo recognizes chat IDs `mimo-v2.5-pro` and `mimo-v2.5`, while TTS IDs are selected through `codewhale speech` / `tts`. Generic `openai`, `atlascloud`, `wanjie-ark`, `xiaomi-mimo`, `arcee`, `moonshot`, `minimax`, `zai`, `stepfun`, `qianfan`, and Ollama model IDs are passed through unchanged after known aliases are normalized. OpenRouter and SiliconFlow provider configs with a custom `base_url` also preserve explicit model values, which lets OpenAI-compatible gateways accept bare model IDs. Use `/models` or `codewhale models` to discover live IDs from your configured endpoint. `CODEWHALE_MODEL` overrides this for a single process; `DEEPSEEK_MODEL` is the legacy alias.
 - `reasoning_effort` (string, optional): `off`, `low`, `medium`, `high`, `max`, `xhigh`, or `ultracode`; defaults to the configured UI tier. DeepSeek Platform receives top-level `thinking` / `reasoning_effort` fields. OpenAI Codex normalizes stale `off` to `low` and sends `max` / `ultracode` as Responses `xhigh`. Z.ai receives documented `thinking` controls and treats enabled thinking as the GLM coding high/max lane. NVIDIA NIM receives equivalent settings through `chat_template_kwargs`.
 - `verbosity` (string, optional): `normal` or `concise`.
`normal` keeps the default conversational prompt. `concise` appends a prompt discipline block
diff --git a/docs/PROVIDERS.md b/docs/PROVIDERS.md
index f076a9063..e93b7f391 100644
--- a/docs/PROVIDERS.md
+++ b/docs/PROVIDERS.md
@@ -5,12 +5,11 @@ CodeWhale codebase. It is intentionally conservative: shipped entries are limited to provider IDs, config keys, auth paths, base URLs, model resolution, and capability metadata that the code already knows about.
-DeepSeek remains the first-class default provider. NVIDIA NIM, OpenRouter,
-Volcengine Ark, Xiaomi MiMo, Novita, Fireworks, SiliconFlow, Arcee AI,
-DeepInfra, Together AI, Baidu Qianfan, Z.ai, StepFun, MiniMax, generic OpenAI-compatible
-endpoints, self-hosted runtimes, Moonshot/Kimi, and Hugging Face Inference
-Providers are additive routes for running the same terminal harness against
-other hosted or local model endpoints.
+DeepSeek remains the default provider, but every entry in `ProviderKind::ALL`
+and `PROVIDER_REGISTRY` is a first-class selectable provider route. Hosted
+routes, generic OpenAI-compatible endpoints, the OpenAI Codex/ChatGPT route,
+native Anthropic, and local runtimes all run the same terminal harness against
+the selected provider/model/base URL.
 Sources to keep in sync:
@@ -30,10 +29,10 @@ Sources to keep in sync:
 The canonical provider IDs are: `deepseek`, `nvidia-nim`, `openai`, `atlascloud`, `wanjie-ark`, `volcengine`,
-`openrouter`, `xiaomi-mimo`, `novita`, `fireworks`, `siliconflow`,
-`siliconflow-CN`, `arcee`, `moonshot`, `zai`, `stepfun`, `minimax`, `sglang`,
-`vllm`, `ollama`, `huggingface`, `deepinfra`, `together`, `qianfan`, `openai-codex`, and
-`anthropic`.
+`openrouter`, `xiaomi-mimo`, `novita`, `fireworks`, `siliconflow`, `arcee`,
+`siliconflow-CN`, `moonshot`, `sglang`, `vllm`, `ollama`, `huggingface`,
+`together`, `qianfan`, `openai-codex`, `anthropic`, `zai`, `stepfun`,
+`minimax`, and `deepinfra`.
 Use any of these surfaces to select a provider:
@@ -54,6 +53,57 @@ artifact export. Fresh shared config writes to `~/.codewhale/config.toml`. Existing `~/.deepseek/config.toml` files are still read for compatibility.
+### Wire Protocol Compatibility
+
+Provider selection is explicit. A model string prefix such as
+`deepseek-ai/...`, `deepseek/...`, `qwen/...`, or `arcee-ai/...` is a
+provider-owned wire ID or catalog namespace hint under the selected provider.
+It is not a provider switch and must not be treated as proof that the route is
+DeepSeek, OpenRouter, or any other provider.
+
+Set the route with `provider = ""`, `CODEWHALE_PROVIDER=`, or
+`codewhale --provider `. Set the request model with `CODEWHALE_MODEL`, a
+provider-specific model env var, top-level `default_text_model`, or
+`[providers.].model`. Set the endpoint with `CODEWHALE_BASE_URL`, a
+provider-specific base URL env var, or `[providers.].base_url`. Set auth
+with `codewhale auth set --provider `, `[providers.].api_key`, or
+the listed provider env vars.
+
+| Provider ID | TOML table | Wire protocol | Auth env vars |
+| --- | --- | --- | --- |
+| `deepseek` | `[providers.deepseek]` | OpenAI Chat Completions | `DEEPSEEK_API_KEY` |
+| `nvidia-nim` | `[providers.nvidia_nim]` | OpenAI Chat Completions | `NVIDIA_API_KEY`, `NVIDIA_NIM_API_KEY`, `DEEPSEEK_API_KEY` |
+| `openai` | `[providers.openai]` | OpenAI Chat Completions | `OPENAI_API_KEY` |
+| `atlascloud` | `[providers.atlascloud]` | OpenAI Chat Completions | `ATLASCLOUD_API_KEY` |
+| `wanjie-ark` | `[providers.wanjie_ark]` | OpenAI Chat Completions | `WANJIE_ARK_API_KEY`, `WANJIE_API_KEY`, `WANJIE_MAAS_API_KEY` |
+| `volcengine` | `[providers.volcengine]` | OpenAI Chat Completions | `VOLCENGINE_API_KEY`, `VOLCENGINE_ARK_API_KEY`, `ARK_API_KEY` |
+| `openrouter` | `[providers.openrouter]` | OpenAI Chat Completions | `OPENROUTER_API_KEY` |
+| `xiaomi-mimo` | `[providers.xiaomi_mimo]` | OpenAI Chat Completions | `XIAOMI_MIMO_TOKEN_PLAN_API_KEY`, `MIMO_TOKEN_PLAN_API_KEY`, `XIAOMI_MIMO_API_KEY`, `XIAOMI_API_KEY`, `MIMO_API_KEY` |
+| `novita` | `[providers.novita]` | OpenAI Chat Completions | `NOVITA_API_KEY` |
+| `fireworks` | `[providers.fireworks]` | OpenAI Chat Completions | `FIREWORKS_API_KEY` |
+| `siliconflow` | `[providers.siliconflow]` | OpenAI Chat Completions | `SILICONFLOW_API_KEY` |
+| `arcee` | `[providers.arcee]` | OpenAI Chat Completions | `ARCEE_API_KEY` |
+| `siliconflow-CN` | `[providers.siliconflow_cn]` | OpenAI Chat Completions | `SILICONFLOW_API_KEY` |
+| `moonshot` | `[providers.moonshot]` | OpenAI Chat Completions | `MOONSHOT_API_KEY`, `KIMI_API_KEY` |
+| `sglang` | `[providers.sglang]` | OpenAI Chat Completions | `SGLANG_API_KEY` |
+| `vllm` | `[providers.vllm]` | OpenAI Chat Completions | `VLLM_API_KEY` |
+| `ollama` | `[providers.ollama]` | Ollama-local OpenAI-compatible Chat Completions | `OLLAMA_API_KEY` |
+| `huggingface` | `[providers.huggingface]` | OpenAI Chat Completions | `HUGGINGFACE_API_KEY`, `HF_TOKEN` |
+| `together` | `[providers.together]` | OpenAI Chat Completions | `TOGETHER_API_KEY` |
+| `qianfan` | `[providers.qianfan]` | OpenAI Chat Completions | `QIANFAN_API_KEY`, `BAIDU_QIANFAN_API_KEY` |
+| `openai-codex` | `[providers.openai_codex]` | OpenAI Responses | `OPENAI_CODEX_ACCESS_TOKEN`, `CODEX_ACCESS_TOKEN` |
+| `anthropic` | `[providers.anthropic]` | Anthropic Messages | `ANTHROPIC_API_KEY` |
+| `zai` | `[providers.zai]` | OpenAI Chat Completions | `ZAI_API_KEY`, `Z_AI_API_KEY` |
+| `stepfun` | `[providers.stepfun]` | OpenAI Chat Completions | `STEPFUN_API_KEY`, `STEP_API_KEY` |
+| `minimax` | `[providers.minimax]` | OpenAI Chat Completions | `MINIMAX_API_KEY` |
+| `deepinfra` | `[providers.deepinfra]` | OpenAI Chat Completions | `DEEPINFRA_API_KEY`, `DEEPINFRA_TOKEN` |
+
+Default base URLs and models for each route are listed in the shipped provider
+table below. The wire protocol values above are derived from
+`crates/config/src/provider.rs`: `ChatCompletions` is the default,
+`openai-codex` overrides to `Responses`, and `anthropic` overrides to
+`AnthropicMessages`.
+
 ## Auth And Env Rules
 For hosted providers, `codewhale auth set --provider ` saves an API key for