
fix: recover auto chat mode when quota state stalls#472

Open
double2tea wants to merge 1 commit into chenyme:main from double2tea:fix/auto-chat-quota-recovery

Conversation


double2tea commented Apr 13, 2026

Summary

  • add a shared account-selection helper for chat handlers
  • fall back from AUTO to FAST and then EXPERT when chat quota state is stale
  • trigger one throttled on-demand quota refresh before returning "No available accounts"
  • cover the new selection behavior with focused unit tests

Problem

Some deployments can get stuck returning "No available accounts for this model tier" for chat requests on AUTO models, even though quota becomes available again after a manual refresh or a process restart.

In practice there were two missing recovery paths:

  • chat handlers treated AUTO as a hard requirement and did not fall back to other chat modes that still had quota
  • when the in-memory/runtime quota state reported no available account, request handling returned immediately instead of forcing a fresh quota sync and retrying once

This makes the service dependent on long periodic refresh intervals or manual intervention.

What This Changes

1. Chat-mode fallback

For chat models whose public mode is AUTO, account selection now tries:

  • AUTO
  • FAST
  • EXPERT

This behavior is enabled by default and guarded by:

  • features.auto_chat_mode_fallback = true
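The fallback order can be sketched as follows. `select_chat_account` and `pool.pick` are illustrative stand-ins for the PR's actual helper, and the `fallback_enabled` parameter mirrors the `features.auto_chat_mode_fallback` flag:

```python
from typing import Optional

# Fallback order described above; when the requested public mode is AUTO,
# selection walks AUTO -> FAST -> EXPERT until an account has quota.
FALLBACK_ORDER = ["AUTO", "FAST", "EXPERT"]

def select_chat_account(pool, requested_mode: str,
                        fallback_enabled: bool = True) -> Optional[str]:
    """Return the first account with quota, or None if every mode is exhausted."""
    if requested_mode == "AUTO" and fallback_enabled:
        modes = FALLBACK_ORDER
    else:
        modes = [requested_mode]
    for mode in modes:
        account = pool.pick(mode)  # assumed to return None when no quota remains
        if account is not None:
            return account
    return None
```

With the flag off, AUTO behaves as before and no other mode is attempted.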

2. Empty-pool refresh retry

If account selection still finds no candidate, the service now:

  • runs one throttled refresh_on_demand()
  • retries account selection once

This behavior is enabled by default and guarded by:

  • features.on_empty_retry_enabled = true
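The retry path can be sketched like this. `ThrottledRefresher` and the 30-second window are illustrative assumptions standing in for the existing refresh-service throttle; `retry_enabled` mirrors `features.on_empty_retry_enabled`:

```python
import time

class ThrottledRefresher:
    """Runs a refresh callback at most once per min_interval_s (assumed window)."""

    def __init__(self, min_interval_s: float = 30.0):
        self.min_interval_s = min_interval_s
        self._last_refresh = 0.0

    def refresh_on_demand(self, do_refresh) -> bool:
        """Invoke do_refresh() unless throttled; report whether it actually ran."""
        now = time.monotonic()
        if now - self._last_refresh < self.min_interval_s:
            return False
        self._last_refresh = now
        do_refresh()
        return True

def select_with_retry(select, refresher, do_refresh, retry_enabled: bool = True):
    """Select once; on an empty pool, refresh (if allowed) and retry exactly once."""
    account = select()
    if account is None and retry_enabled and refresher.refresh_on_demand(do_refresh):
        account = select()
    return account
```

Because the refresher reports whether it ran, a throttled refresh skips the second selection pass entirely, so a stale empty pool never triggers a refresh storm.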

3. Shared implementation

The recovery logic is centralized in a shared products-layer helper so OpenAI Chat Completions, OpenAI Responses, and Anthropic Messages stay consistent.
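Putting the two behaviors together, the shared helper might look roughly like the sketch below. All names and signatures here are assumptions for illustration; the PR's real helper lives in app/products/_account_selection.py:

```python
def select_account_with_recovery(pick, mode, refresh_once, features):
    """Single selection entry point shared by all three chat handlers.

    pick(mode)     -> account or None (no quota in that mode)
    refresh_once() -> bool, True only when the throttled refresh actually ran
    features       -> dict of feature flags, both defaulting to enabled
    """
    if mode == "AUTO" and features.get("auto_chat_mode_fallback", True):
        modes = ("AUTO", "FAST", "EXPERT")
    else:
        modes = (mode,)
    for attempt in range(2):  # second pass only after a successful refresh
        for m in modes:
            account = pick(m)
            if account is not None:
                return account
        if attempt == 0 and features.get("on_empty_retry_enabled", True) and refresh_once():
            continue
        break
    return None
```

Each protocol handler (Chat Completions, Responses, Messages) would call this one function, so the fallback order and the single-retry rule cannot drift apart between endpoints.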

Why This Is Safe

  • the fallback is limited to chat requests only
  • image/video behavior is unchanged
  • the on-demand refresh path is already throttled by existing refresh-service logic
  • the feature flags keep the behavior configurable

Verification

  • ./.venv/bin/python -m unittest tests.test_account_selection
  • ./.venv/bin/python -m py_compile app/products/_account_selection.py tests/test_account_selection.py app/products/openai/chat.py app/products/openai/responses.py app/products/anthropic/messages.py

Notes

I observed this against a live deployment where:

  • cached AUTO quota had stalled at zero
  • FAST / EXPERT quota was still available
  • a manual refresh or service restart restored AUTO

This patch addresses both the stale-state case and the no-self-recovery case.
