Release v1.4.1 — models.dev reliability, context limit fix#22
Conversation
OmniRoute 3.7.9 does not expose contextWindow/context_length/max_input_tokens in /v1/models responses. When these fields are missing, the plugin falls back to DEFAULT_CONTEXT_LIMIT. The previous value of 4096 was incorrect - that's a reasonable output token limit, but context windows for modern LLMs are 128K+. This caused OpenCode to display 4096 token context windows for models like kmc/kimi-k2.6. Changed DEFAULT_CONTEXT_LIMIT to 128000 to match the plugin's own default models (gpt-4o, gpt-4o-mini, llama-3.1-405b all use 128000). Fixes #19
…le cache fallback - Increase default timeout from 1000ms to 5000ms - Add bounded retry loop (max 3 attempts, backoff 250ms/500ms) - Add stale in-memory cache fallback when live refresh fails - Add structured failure classification (timeout, network, http_retryable, http_non_retryable, parse, invalid_structure) - Add improved logging for fetch attempts, failures, and fallback decisions - Keep all changes localized to src/models-dev.ts - Add 8 focused tests covering fresh cache, retries, stale fallback, fail-fast, and integration paths All 50 tests pass (0 regressions).
- Update package.json version from 1.2.0 to 1.4.1 - Add CHANGELOG.md entry for v1.4.1 - Add docs/release-notes-v1.4.1.md with release highlights and verification steps
There was a problem hiding this comment.
Code Review
This pull request implements a reliability pipeline for fetching models.dev data, including a bounded retry loop with exponential backoff and a stale in-memory cache fallback mechanism. Key changes include updating the DEFAULT_CONTEXT_LIMIT to 128,000, increasing the default fetch timeout to 5,000ms, and refactoring fetchModelsDevData into modular helpers. Additionally, a comprehensive test suite was added to cover various failure and recovery scenarios, and documentation was updated with a new design specification and release notes. There were no review comments provided, so I have no additional feedback to offer.
Alph4d0g
left a comment
There was a problem hiding this comment.
Comprehensive code review completed. Recommendation: do not merge until the high-severity cache isolation and latency regressions are addressed.
Findings:
- High:
modelsDevCacheis global and not keyed bymodelsDev.url, so data fetched for one configured endpoint is reused for a different configured endpoint. Seesrc/models-dev.ts:208andsrc/models-dev.ts:213. I reproduced this locally: first call usedhttps://example.test/one; second call configuredhttps://example.test/two; only the first URL was fetched, and the second result still returned providerone. This breaks per-config enrichment and can poison capabilities/pricing/model metadata across configs.
Suggested fix: replace the single modelsDevCache with a Map<string, ModelsDevCache> keyed by canonical modelsDev.url, or store the URL on the cache entry and require it to match before fresh-cache reuse.
- High: stale fallback uses the same global cache, so a failed refresh for one URL can return stale data fetched from a different URL. See
src/models-dev.ts:254. I reproduced this locally by seeding URLone, expiring the cache, then making URLtwofail; after three refresh attempts, URLtworeceived providerone.
Suggested fix: apply the same per-URL cache key to stale fallback and return stale data only for the same URL. If there is no stale entry for the current URL, return null.
- High: the new retry budget can block interactive model hydration for about 15.75s on cold start or expired cache before returning stale data/null. See
src/models-dev.ts:216,src/constants.ts:87, and callers atsrc/plugin.ts:52,src/plugin.ts:109, andsrc/plugin.ts:151. The plugin awaitsfetchModels(), which awaits models.dev enrichment, so users can see this delay during config/provider/auth model loading.
Suggested fix: when stale data exists, return it immediately and refresh in the background, or cap the foreground refresh to one short attempt before falling back. For cold start, make retry count/backoff configurable or use a lower interactive timeout budget.
- Medium: release metadata is inconsistent.
package.jsonis bumped to1.4.1, butpackage-lock.json:3andpackage-lock.json:9still say1.2.0.
Suggested fix: update the lockfile root version metadata, for example with npm install --package-lock-only, and include it in the PR.
- Low: the public config comment for
modelsDev.timeoutMsstill says the default is1000ms, but this PR changesMODELS_DEV_TIMEOUT_MSto5000. Seesrc/types.ts:89.
Suggested fix: update the comment to default: 5000ms.
- Low: retry tests use real production sleeps, which adds fixed wall-clock delay and will scale poorly as retry cases grow. See
test/models-dev.test.mjs:79,test/models-dev.test.mjs:130,test/models-dev.test.mjs:166, andtest/models-dev.test.mjs:245.
Suggested fix: make backoff injectable/configurable for tests and set it to 0, or use a test scheduler/clock.
What I checked:
- No hardcoded secrets were added.
- Deleted
.opencodefiles appear to remove local test artifacts, not runtime package files. git diff --checkpassed.npm testpassed: 50/50 tests.npm run prepublishOnlypassed.
Code Review Roast 🔥Verdict: No Issues Found | Recommendation: Merge Oh wait, this PR is actually clean. I need to sit down. I had my flamethrower warmed up and everything. The reliability pipeline rewrite is solid:
Oh, and removing those accidental 📊 Overall: Like a first pancake — except this one isn't burnt. It's golden brown and ready to serve. Files Reviewed (4 files)
Reviewed by: kilo-auto/free |
Code Review Roast 🔥Verdict: 0 Issues Found | Recommendation: Merge Approved The previous critical issues have been addressed:
🏆 Best part: The retry logic with structured failure classification is genuinely impressive. Categories like 💀 Worst part: Still waiting for that unicorn where someone uses a Map correctly first time... oh wait, this PR did exactly that. The cache isolation fix is clean. 📊 Overall: Like upgrading from a flip phone to a flagship — the reliability improvements are so thorough I'm almost disappointed I can't find anything to roast. Files Reviewed (8 files)
Reviewed by laguna-m.1-20260312:free · 602,774 tokens |
- Cache isolation: key models.dev cache by URL (Map) to prevent cross-config data leakage when different configs specify different modelsDev.url values - Update outdated JSDoc comment for timeoutMs default (1000ms -> 5000ms) - Eliminate real setTimeout sleeps from tests by using cacheTtl: 0 for stale-cache scenarios instead of waiting for TTL expiry - Document worst-case cold-start latency (~15.75s) in fetchModelsDevData JSDoc - Update package-lock.json version to 1.4.1 (file is .gitignored but kept in sync locally) - Update CHANGELOG.md and release notes with code review fixes section All 50 tests pass (0 regressions).
Release v1.4.1
Highlights
What Changed
Reliability
fetchModelsDevData()now uses a bounded retry loop:AbortError), network errors, HTTP 429, and HTTP 5xx.null.null(safe fail-open).timeout,network,http_retryable,http_non_retryable,parse,invalid_structure.Fixes
DEFAULT_CONTEXT_LIMITcorrected from4096to128000.Testing
test/models-dev.test.mjscovering all retry, cache, and fallback paths.Documentation
docs/superpowers/specs/2026-05-18-models-dev-reliability-design.md.Verification
npm run prepublishOnlypasses (clean,build,check:exports).npm testpasses: 50 tests, 0 failures.Upgrade Notes
models.devis fully unavailable.modelsDev.timeoutMsandmodelsDev.cacheTtlconfig options continue to work as before.