All proxy endpoints (chat/messages/responses) optionally accept `Authorization: Bearer {proxy_api_key}`. The Dashboard UI uses a cookie-based session (`_codex_session`).
OpenAI-compatible chat completion.

- Streaming: SSE with `choice.delta` events
- Non-streaming: `{ id, choices, usage }`
- Errors: `{ error: { message, type, code } }`
- `max_tokens`, `max_completion_tokens`, and `max_output_tokens` are accepted for client compatibility but are not forwarded to Codex.
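As an illustration of consuming the streaming shape, here is a minimal sketch that accumulates assistant text from the SSE `data:` lines (the helper name is ours; it assumes standard OpenAI-style chunks terminated by `data: [DONE]`):

```python
import json


def accumulate_chat_sse(lines):
    """Collect assistant text from OpenAI-style chat SSE lines.

    Each 'data:' payload carries a chunk whose choices[0].delta may
    hold a 'content' fragment; the stream ends with 'data: [DONE]'.
    """
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # ignore comments / blank keep-alives
        payload = line[len("data: "):]
        if payload.strip() == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)
```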
Anthropic Messages API compatible.
```jsonc
// Request
{
  "model": "claude-sonnet-4-20250514",
  "messages": [{"role": "user", "content": "Hello"}],
  "max_tokens": 1024,
  "stream": true,
  "thinking": {"type": "enabled"} // optional
}
```

- Auth: `x-api-key` or `Authorization: Bearer`
- Errors: `{ type: "error", error: { type, message } }`
Google Gemini compatible.
```jsonc
// Request
{
  "contents": [{"role": "user", "parts": [{"text": "Hello"}]}],
  "generationConfig": {"temperature": 0.7, "maxOutputTokens": 1024},
  "systemInstruction": {"parts": [{"text": "You are helpful."}]}
}
```

- Auth: `x-goog-api-key` header, `key` query param, or Bearer token
- Errors: `{ error: { code, message, status } }`
Native Codex Responses API passthrough (WebSocket transport).
```jsonc
// Request
{
  "model": "o4-mini",
  "instructions": "You are helpful.",
  "input": [{"type": "message", "content": "Hello"}],
  "stream": true,
  "reasoning": {"effort": "medium"},
  "tools": [],
  "previous_response_id": "resp_xxx" // multi-turn
}
```

- Streaming: SSE with `response.created`, `response.output_text.delta`, `response.completed`
- Non-streaming: `{ response, usage, responseId }`
- Do not send `max_output_tokens` to native Codex. The proxy accepts it only for compatibility and strips it, because the real Codex backend rejects it with `400 Unsupported parameter: max_output_tokens`.
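The stripping behavior described above can be sketched as a small payload sanitizer (the function name is ours; it only models the one documented rejection):

```python
def sanitize_for_native_codex(body: dict) -> dict:
    """Return a copy of a /v1/responses payload with the
    compatibility-only max_output_tokens cap removed, since the
    real Codex backend rejects it with a 400."""
    cleaned = dict(body)          # shallow copy; leave caller's dict intact
    cleaned.pop("max_output_tokens", None)
    return cleaned
```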
Declare `{"type": "image_generation", ...}` in `tools[]` to let the model invoke the server-side image generation backend (`gpt-image-2`). Requires a ChatGPT Plus or higher account; on free plans the tool is silently stripped upstream and the model falls back to returning SVG text.
Supported fields (all optional except `type`):

| Field | Enum / range | Default | Notes |
|---|---|---|---|
| `size` | `1024x1024`, `1024x1536`, `1536x1024`, `2048x2048`, `2048x3072`, `3072x2048`, `3840x2160` (4K UHD), `2160x3840` (4K portrait), `2304x3072` (3:4), `auto` | `auto` | Width and height must both be divisible by 16. Longest edge ≤ 3840 px. Total pixel budget ≈ 8 MP (`3072x3072` rejected). Resolutions below 1024 px are also rejected (min pixel budget) |
| `output_format` | `png` / `jpeg` / `webp` | `png` | `gif` is rejected |
| `output_compression` | integer 0–100 | `100` | `jpeg` / `webp` only; PNG rejects any non-100 value |
| `background` | `auto` / `opaque` | `auto` | `transparent` is rejected for this model |
| `moderation` | `auto` / `low` | `auto` | other enums rejected |
| `partial_images` | integer 0–3 | `0` | >3 rejected |
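The `size` constraints above can be checked client-side before sending a request. A minimal sketch (the exact ≈8 MP budget is inferred from the accepted and rejected examples, and the helper name is ours):

```python
def validate_image_size(size: str) -> bool:
    """Check a size string against the documented constraints:
    dimensions divisible by 16, longest edge <= 3840 px, shortest
    edge >= 1024 px, and a total pixel budget of ~8 MP (we assume
    3840*2160, since 3840x2160 passes and 3072x3072 is rejected)."""
    if size == "auto":
        return True
    try:
        w, h = (int(p) for p in size.split("x"))
    except ValueError:
        return False
    if w % 16 or h % 16:
        return False
    if max(w, h) > 3840 or min(w, h) < 1024:
        return False
    return w * h <= 3840 * 2160
```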
Silently rewritten / hard-rejected fields:

- `model`: whatever you send, upstream forces `gpt-image-2`.
- `quality`: any value is echoed back as `auto`; the user-supplied value has no effect.
- `n`: rejected (`unknown_parameter`); one image per call.
- `input_image`, `mask`, `input_fidelity`, `style`, `response_format`: rejected.
Event stream order (when the model invokes the tool):

1. `response.created`: echoes `tools[]` with upstream-normalized fields.
2. `response.output_item.added`: `{type: "image_generation_call", ...}`.
3. `response.image_generation_call.in_progress` → `.generating` → (optional) `.partial_image` × N.
4. `response.output_item.done`: the completed `image_generation_call` with:
   - `result`: base64-encoded image bytes (PNG / JPEG / WebP per `output_format`).
   - `revised_prompt`: the final prompt the model actually used.
5. `response.completed`.
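A sketch of pulling the final image out of that event sequence (the `item` envelope key on `response.output_item.done` is an assumption about the event shape, and the helper name is ours):

```python
import base64


def extract_image_result(events):
    """Scan Responses-API events for the completed
    image_generation_call and return (image_bytes, revised_prompt),
    or (None, None) if the tool never completed."""
    for ev in events:
        if ev.get("type") != "response.output_item.done":
            continue
        item = ev.get("item", {})  # assumed envelope key
        if item.get("type") == "image_generation_call":
            return (base64.b64decode(item["result"]),
                    item.get("revised_prompt"))
    return None, None
```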
Token accounting: `response.completed.response.usage` reports the host model's tokens; the `image_generation` tool's own tokens come back separately as `response.completed.response.tool_usage.image_gen.{input_tokens, output_tokens, total_tokens}`. The proxy passes both through verbatim and tracks them as separate counters on the dashboard (`total_image_input_tokens` / `total_image_output_tokens`) so image-gen usage doesn't pollute host-model token charts.
Request accounting: the proxy also counts each `image_generation` request as a success or failure. `total_image_request_count` increments when the upstream returned a real image (non-zero `tool_usage.image_gen.output_tokens`); `total_image_request_failed_count` increments when the tool was silently stripped (Free plan), the upstream returned an error, or the response came back empty. Both surface in `/admin/usage-stats/summary` and the Dashboard's "Image Requests" card.
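The success/failure rule above reduces to one predicate on the completed response; a minimal sketch (function name is ours):

```python
def classify_image_request(completed: dict) -> str:
    """Mirror the documented counter logic: a request is a success
    only when tool_usage.image_gen.output_tokens is non-zero;
    stripped tools, errors, and empty responses all count as failed."""
    tool = (completed.get("response", {})
                     .get("tool_usage", {})
                     .get("image_gen", {}))
    return "success" if tool.get("output_tokens", 0) > 0 else "failed"
```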
Edit mode (supply a reference image): put an `input_image` block in the user message content. `data:` URLs and HTTPS URLs both work.

```json
{
  "model": "gpt-5.5",
  "stream": true,
  "input": [{
    "role": "user",
    "content": [
      {"type": "input_text", "text": "Make this sky a sunset."},
      {"type": "input_image", "image_url": "data:image/png;base64,AAA...", "detail": "high"}
    ]
  }],
  "tools": [{"type": "image_generation", "size": "1024x1024"}]
}
```

Legal content-part types (from upstream enum validation): `input_text`, `input_image`, `output_text`, `refusal`, `input_file`, `computer_screenshot`, `summary_text`.
OpenAI Chat compatibility accepts `tools: [{"type": "image_generation"}]`, but the stable image payload is exposed by `/v1/responses` as `image_generation_call.result`. Use `/v1/responses` for clients that need the base64 image bytes.
The optional bridge runs on a separate listener, defaulting to http://127.0.0.1:11434.
It is disabled by default and can be controlled through Dashboard settings or the admin
API. Ollama endpoints are intentionally unauthenticated; keep the listener bound to
localhost unless you explicitly trust the network.
Browser CORS access is restricted to loopback origins (localhost, 127.x.x.x,
and ::1) so non-local web pages cannot read bridge responses by default. The
bridge injects the configured Codex Proxy API key for /v1/* passthrough
requests, so exposing it beyond localhost also exposes the main proxy API
without requiring clients to know that key.
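The loopback-origin restriction described above amounts to one check on the `Origin` header's host; a minimal sketch (function name is ours):

```python
import ipaddress
from urllib.parse import urlparse


def is_loopback_origin(origin: str) -> bool:
    """Allow only localhost, 127.x.x.x, and ::1 origins, per the
    documented CORS policy for the bridge."""
    host = urlparse(origin).hostname  # lowercased, IPv6 brackets stripped
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # non-IP hostname other than localhost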
| Method | Path | Description |
|---|---|---|
| GET | `/api/version` | Version probe → `{ version }` |
| GET | `/api/tags` | Model list in Ollama format |
| POST | `/api/show` | Model metadata and capabilities |
| POST | `/api/chat` | Chat completions, streaming as NDJSON by default |
| Any | `/v1/*` | OpenAI-compatible passthrough to the main proxy |
```jsonc
// POST http://127.0.0.1:11434/api/chat
{
  "model": "codex",
  "messages": [{"role": "user", "content": "Hello"}],
  "stream": true,
  "think": "medium" // optional: false | true | low | medium | high | xhigh
}
```

Supported request mappings:
| Ollama field | Upstream OpenAI field |
|---|---|
| `messages[].images` | `content[].image_url` data URLs |
| `tools` | `tools` |
| `think` | `reasoning_effort` |
| `format: "json"` | `response_format: { type: "json_object" }` |
| `format: { ... }` | strict JSON schema response format |
| `options.temperature` | `temperature` |
| `options.top_p` | `top_p` |
| `options.num_predict` | `max_tokens` |
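The mapping table above can be sketched as a small translation function (the exact `json_schema` envelope for `format: { ... }` is an assumption, as is the helper name; the image and tools passthroughs are omitted for brevity):

```python
def map_ollama_chat(body: dict) -> dict:
    """Translate the documented Ollama /api/chat fields into their
    OpenAI-compatible upstream equivalents."""
    out = {"model": body.get("model"), "messages": body.get("messages", [])}
    if isinstance(body.get("think"), str):   # boolean think values not mapped here
        out["reasoning_effort"] = body["think"]
    fmt = body.get("format")
    if fmt == "json":
        out["response_format"] = {"type": "json_object"}
    elif isinstance(fmt, dict):
        # assumed strict-schema envelope; the docs only say
        # "strict JSON schema response format"
        out["response_format"] = {"type": "json_schema",
                                  "json_schema": {"schema": fmt, "strict": True}}
    opts = body.get("options", {})
    for src, dst in (("temperature", "temperature"),
                     ("top_p", "top_p"),
                     ("num_predict", "max_tokens")):
        if src in opts:
            out[dst] = opts[src]
    return out
```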
| Method | Path | Description |
|---|---|---|
| GET | `/v1/models` | List models (OpenAI format) |
| GET | `/v1/models/catalog` | Full catalog with reasoning efforts |
| GET | `/v1/models/:id` | Single model detail |
| GET | `/v1/models/:id/info` | Extended model info |
| GET | `/v1beta/models` | List models (Gemini format) |
| POST | `/admin/refresh-models` | Force refresh from upstream |
Model catalog entries can include token metadata:

| Field | Meaning |
|---|---|
| `contextWindow` | Static or backend-provided context window for display and client hints |
| `maxContextWindow` | Backend-provided maximum expandable context window, when reported |
| `maxOutputTokens` | Static or backend-provided maximum output tokens for display and client hints |
| `truncationPolicyLimit` | Backend-provided truncation policy limit, when reported |
Static catalog values are defined in `config/models.yaml`; dynamic entries from `/backend-api/codex/models` win when the same model ID is returned by upstream.

On 2026-05-08, real Codex backend metadata returned `context_window=272000`, `max_context_window=272000`, `truncation_policy.limit=10000` for `gpt-5.5`, and `context_window=272000`, `max_context_window=1000000`, `truncation_policy.limit=10000` for `gpt-5.4`. Treat these as runtime Codex limits, not as proof that request-level context or max-token switches are supported.
| Method | Path | Description |
|---|---|---|
| GET | `/auth/accounts` | List all accounts |
| POST | `/auth/accounts` | Add single account (`{ token?, refreshToken? }`) |
| DELETE | `/auth/accounts/:id` | Delete account |
| PATCH | `/auth/accounts/:id/label` | Set label (`{ label }`) |
| Method | Path | Description |
|---|---|---|
| POST | `/auth/accounts/import` | Bulk import (`{ accounts: [{token?, refreshToken?, label?}] }`) |
| POST | `/auth/accounts/batch-delete` | Bulk delete (`{ ids: [] }`) |
| POST | `/auth/accounts/batch-status` | Bulk enable/disable (`{ ids: [], status: "active"\|"disabled" }`) |
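For illustration, a bulk-import body for `POST /auth/accounts/import` might look like this (field names from the table above; the token values are placeholders):

```json
{
  "accounts": [
    { "refreshToken": "rt_placeholder", "label": "team-a" },
    { "token": "access_token_placeholder" }
  ]
}
```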
| Method | Path | Description |
|---|---|---|
| POST | `/auth/accounts/health-check` | Check accounts (`{ ids?, stagger_ms?, concurrency? }`) |
| POST | `/auth/accounts/:id/refresh` | Refresh single account |
| GET | `/auth/accounts/:id/quota` | Get quota/usage |
| POST | `/auth/accounts/:id/reset-usage` | Reset usage counters |
| Method | Path | Description |
|---|---|---|
| GET | `/auth/accounts/export` | Export accounts (`?ids=a,b&format=minimal`) |
| Method | Path | Description |
|---|---|---|
| GET | `/auth/accounts/:id/cookies` | Get stored cookies |
| POST | `/auth/accounts/:id/cookies` | Set cookies (`{ cookies }`) |
| DELETE | `/auth/accounts/:id/cookies` | Clear cookies |
| Method | Path | Description |
|---|---|---|
| POST | `/auth/login-start` | Start OAuth → `{ authUrl, state }` |
| GET | `/auth/login` | 302 redirect to Auth0 |
| POST | `/auth/code-relay` | OAuth code exchange (`{ callbackUrl }`) |
| GET | `/auth/callback` | OAuth callback handler |
| POST | `/auth/device-login` | Start device code flow |
| GET | `/auth/device-poll/:deviceCode` | Poll device authorization |
| POST | `/auth/import-cli` | Import from Codex CLI `auth.json` |
| POST | `/auth/token` | Manual token submit |
| GET | `/auth/status` | Auth status + pool summary |
| POST | `/auth/logout` | Clear all accounts |
| Method | Path | Description |
|---|---|---|
| GET | `/api/proxies` | List proxies with health & assignments |
| POST | `/api/proxies` | Add proxy (`{ url }` or `{ host, port, username, password }`) |
| PUT | `/api/proxies/:id` | Update proxy |
| DELETE | `/api/proxies/:id` | Delete proxy |
| Method | Path | Description |
|---|---|---|
| POST | `/api/proxies/:id/check` | Health check single proxy |
| POST | `/api/proxies/check-all` | Health check all proxies |
| POST | `/api/proxies/:id/enable` | Enable proxy |
| POST | `/api/proxies/:id/disable` | Disable proxy |
| Method | Path | Description |
|---|---|---|
| GET | `/api/proxies/assignments` | List all assignments |
| POST | `/api/proxies/assign` | Assign proxy to account (`{ accountId, proxyId }`) |
| DELETE | `/api/proxies/assign/:accountId` | Unassign |
| POST | `/api/proxies/assign-bulk` | Bulk assign (`{ assignments: [] }`) |
| POST | `/api/proxies/assign-rule` | Auto-assign by rule (`{ rule: "round-robin", ... }`) |
| Method | Path | Description |
|---|---|---|
| GET | `/api/proxies/export` | Export as YAML |
| POST | `/api/proxies/import` | Import YAML or plain text (`host:port:user:pass`) |
| GET | `/api/proxies/assignments/export` | Export assignments |
| POST | `/api/proxies/assignments/import` | Preview assignment import |
| POST | `/api/proxies/assignments/apply` | Apply assignment import |
| Method | Path | Description |
|---|---|---|
| PUT | `/api/proxies/settings` | Update health check interval |
| Method | Path | Description |
|---|---|---|
| GET | `/admin/general-settings` | Get all settings |
| POST | `/admin/general-settings` | Update settings (returns `restart_required`) |
| GET | `/admin/settings` | Get proxy API key |
| POST | `/admin/settings` | Set proxy API key |
| GET | `/admin/rotation-settings` | Get rotation strategy |
| POST | `/admin/rotation-settings` | Set rotation strategy |
| GET | `/admin/quota-settings` | Get quota settings |
| POST | `/admin/quota-settings` | Set quota settings |
| GET | `/admin/ollama-settings` | Get Ollama Bridge settings plus runtime status |
| POST | `/admin/ollama-settings` | Persist Ollama Bridge settings and restart the bridge |
| GET | `/admin/ollama-status` | Get Ollama Bridge runtime status |
| Method | Path | Description |
|---|---|---|
| GET | `/health` | Health probe → `{ status, authenticated, pool }` |
| POST | `/admin/test-connection` | Full connectivity diagnostics |
| GET | `/debug/fingerprint` | TLS fingerprint config (localhost only) |
| GET | `/debug/diagnostics` | System diagnostics (localhost only) |
| GET | `/debug/models` | Model store internals |
Optional bridge to a local official `codex app-server` instance. This is the path for using official Codex app plugins such as the Chrome/browser plugin. It is disabled by default (`official_agent.enabled: false`).

All endpoints require `official_agent.api_key`; the bridge refuses requests when the dedicated official-agent API key is not configured. Do not reuse `server.proxy_api_key` here, because this bridge can drive local app-server plugins and approval flows.
| Method | Path | Purpose |
|---|---|---|
| GET | `/official-agent/apps` | List official Codex apps/connectors from `app/list` |
| POST | `/official-agent/threads` | Start an app-server thread (`{ model?, cwd? }`) |
| POST | `/official-agent/threads/:threadId/turns` | Start a turn and stream app-server notifications as SSE |
`approvalPolicy`, when provided on a turn, must be one of `untrusted`, `on-request`, `on-failure`, or `never`.
Example turn using an official Chrome app mention:

```json
{
  "text": "Open localhost:8080 and inspect the dashboard",
  "app": { "id": "chrome", "name": "Chrome" }
}
```

The bridge sends a text item plus a mention item with `path: "app://{id}"`. Use `/official-agent/apps` to discover the real app id before hard-coding one.
| Method | Path | Description |
|---|---|---|
| GET | `/admin/update-status` | Check update availability |
| POST | `/admin/check-update` | Trigger update check |
| POST | `/admin/apply-update` | Apply self-update (SSE progress stream) |
| Method | Path | Description |
|---|---|---|
| GET | `/admin/usage-stats/summary` | Cumulative usage by account/model |
| GET | `/admin/usage-stats/history` | Time-series data (`?granularity=hourly&hours=24`) |
| Method | Path | Description |
|---|---|---|
| GET | `/auth/quota/warnings` | Active quota warnings |
When `quota.skip_exhausted` is enabled, account acquisition filters out active accounts whose cached quota has `rate_limit.limit_reached === true`, `secondary_rate_limit.limit_reached === true`, or `code_review_rate_limit.limit_reached === true`. This happens before session affinity, so `preferredEntryId` cannot keep a request on an exhausted account. Near-full quota such as `used_percent=99` is not skipped until upstream marks `limit_reached` or the account receives a 429 and enters `rate_limited` backoff. Secondary and code-review cache windows are cleared after their own `reset_at` passes.
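The skip rule above reduces to one predicate over the cached quota; a minimal sketch (function name is ours):

```python
EXHAUSTION_WINDOWS = ("rate_limit",
                      "secondary_rate_limit",
                      "code_review_rate_limit")


def is_exhausted(quota: dict) -> bool:
    """Mirror the documented skip_exhausted filter: skip only when a
    window explicitly reports limit_reached; a high used_percent
    alone is never enough."""
    return any(
        quota.get(window, {}).get("limit_reached") is True
        for window in EXHAUSTION_WINDOWS
    )
```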
| Method | Path | Description |
|---|---|---|
| POST | `/auth/dashboard-login` | Login with password → sets session cookie (rate limited: 5/min) |
| POST | `/auth/dashboard-logout` | Clear session |
| GET | `/auth/dashboard-status` | Check if login required |
Each protocol returns errors in its native format:
| Protocol | Format |
|---|---|
| OpenAI | `{ error: { message, type, code, param } }` |
| Anthropic | `{ type: "error", error: { type, message } }` |
| Gemini | `{ error: { code, message, status } }` |
| Responses | `{ type: "error", error: { type, code, message } }` |
| Admin | `{ error: "..." }` |
Common HTTP status codes: 401 (not authenticated), 429 (rate limited), 503 (no available accounts).
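The per-protocol envelopes in the table can be sketched as a single formatter (function name is ours; the concrete `type`, `code`, and `status` values here are placeholders, since the docs only fix the field layout):

```python
def format_error(protocol: str, message: str,
                 code: str = "server_error") -> dict:
    """Render one internal error in each protocol's native envelope,
    matching the field sets documented above."""
    if protocol == "openai":
        return {"error": {"message": message, "type": code,
                          "code": code, "param": None}}
    if protocol == "anthropic":
        return {"type": "error", "error": {"type": code, "message": message}}
    if protocol == "gemini":
        return {"error": {"code": 500, "message": message, "status": code}}
    if protocol == "responses":
        return {"type": "error",
                "error": {"type": code, "code": code, "message": message}}
    return {"error": message}  # admin endpoints use a bare string
```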