Skip to content

Release v1.4.1 — models.dev reliability, context limit fix#22

Closed
Alph4d0g wants to merge 6 commits into
mainfrom
release/v1.4.1
Closed

Release v1.4.1 — models.dev reliability, context limit fix#22
Alph4d0g wants to merge 6 commits into
mainfrom
release/v1.4.1

Conversation

@Alph4d0g
Copy link
Copy Markdown
Owner

Release v1.4.1

Highlights

  • models.dev enrichment no longer fails on transient network slowness. The fetch pipeline now retries up to 3 times with exponential backoff and falls back to stale cached data if live refresh fails.
  • Default context limit corrected from 4096 to 128000 tokens to match OmniRoute API behavior.
  • Structured observability for enrichment failures with per-attempt diagnostics and fallback decisions.

What Changed

Reliability

  • fetchModelsDevData() now uses a bounded retry loop:
    • Maximum 3 attempts with 250ms / 500ms backoff.
    • Retries on: timeouts (AbortError), network errors, HTTP 429, and HTTP 5xx.
    • Fail-fast on: HTTP 4xx (non-429) and structurally invalid responses.
  • Stale in-memory cache fallback:
    • If cached data exists but TTL expired, live refresh is attempted first.
    • If all refresh attempts fail, the stale cached data is returned instead of null.
    • If no cache exists and all attempts fail, returns null (safe fail-open).
  • Timeout budget increased from 1000ms to 5000ms per attempt.
  • Failure classification: timeout, network, http_retryable, http_non_retryable, parse, invalid_structure.

Fixes

  • DEFAULT_CONTEXT_LIMIT corrected from 4096 to 128000.

Testing

  • Added 8 focused tests in test/models-dev.test.mjs covering all retry, cache, and fallback paths.
  • Full regression suite: 50/50 tests pass (0 failures).

Documentation

  • Added design spec: docs/superpowers/specs/2026-05-18-models-dev-reliability-design.md.

Verification

  • npm run prepublishOnly passes (clean, build, check:exports).
  • npm test passes: 50 tests, 0 failures.
  • TypeScript strict mode compiles cleanly.

Upgrade Notes

  • No breaking changes. Plugin behavior remains safe when models.dev is fully unavailable.
  • Existing modelsDev.timeoutMs and modelsDev.cacheTtl config options continue to work as before.

Alph4d0g added 4 commits May 17, 2026 17:31
OmniRoute 3.7.9 does not expose contextWindow/context_length/max_input_tokens
in /v1/models responses. When these fields are missing, the plugin falls back
to DEFAULT_CONTEXT_LIMIT.

The previous value of 4096 was incorrect - that's a reasonable output token
limit, but context windows for modern LLMs are 128K+. This caused OpenCode
to display 4096 token context windows for models like kmc/kimi-k2.6.

Changed DEFAULT_CONTEXT_LIMIT to 128000 to match the plugin's own default
models (gpt-4o, gpt-4o-mini, llama-3.1-405b all use 128000).

Fixes #19
…le cache fallback

- Increase default timeout from 1000ms to 5000ms
- Add bounded retry loop (max 3 attempts, backoff 250ms/500ms)
- Add stale in-memory cache fallback when live refresh fails
- Add structured failure classification (timeout, network, http_retryable,
  http_non_retryable, parse, invalid_structure)
- Add improved logging for fetch attempts, failures, and fallback decisions
- Keep all changes localized to src/models-dev.ts
- Add 8 focused tests covering fresh cache, retries, stale fallback,
  fail-fast, and integration paths

All 50 tests pass (0 regressions).
- Update package.json version from 1.2.0 to 1.4.1
- Add CHANGELOG.md entry for v1.4.1
- Add docs/release-notes-v1.4.1.md with release highlights and verification steps
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements a reliability pipeline for fetching models.dev data, including a bounded retry loop with exponential backoff and a stale in-memory cache fallback mechanism. Key changes include updating the DEFAULT_CONTEXT_LIMIT to 128,000, increasing the default fetch timeout to 5,000ms, and refactoring fetchModelsDevData into modular helpers. Additionally, a comprehensive test suite was added to cover various failure and recovery scenarios, and documentation was updated with a new design specification and release notes. There were no review comments provided, so I have no additional feedback to offer.

Copy link
Copy Markdown
Owner Author

@Alph4d0g Alph4d0g left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comprehensive code review completed. Recommendation: do not merge until the high-severity cache isolation and latency regressions are addressed.

Findings:

  1. High: modelsDevCache is global and not keyed by modelsDev.url, so data fetched for one configured endpoint is reused for a different configured endpoint. See src/models-dev.ts:208 and src/models-dev.ts:213. I reproduced this locally: first call used https://example.test/one; second call configured https://example.test/two; only the first URL was fetched, and the second result still returned provider one. This breaks per-config enrichment and can poison capabilities/pricing/model metadata across configs.

Suggested fix: replace the single modelsDevCache with a Map<string, ModelsDevCache> keyed by canonical modelsDev.url, or store the URL on the cache entry and require it to match before fresh-cache reuse.

  1. High: stale fallback uses the same global cache, so a failed refresh for one URL can return stale data fetched from a different URL. See src/models-dev.ts:254. I reproduced this locally by seeding URL one, expiring the cache, then making URL two fail; after three refresh attempts, URL two received provider one.

Suggested fix: apply the same per-URL cache key to stale fallback and return stale data only for the same URL. If there is no stale entry for the current URL, return null.

  1. High: the new retry budget can block interactive model hydration for about 15.75s on cold start or expired cache before returning stale data/null. See src/models-dev.ts:216, src/constants.ts:87, and callers at src/plugin.ts:52, src/plugin.ts:109, and src/plugin.ts:151. The plugin awaits fetchModels(), which awaits models.dev enrichment, so users can see this delay during config/provider/auth model loading.

Suggested fix: when stale data exists, return it immediately and refresh in the background, or cap the foreground refresh to one short attempt before falling back. For cold start, make retry count/backoff configurable or use a lower interactive timeout budget.

  1. Medium: release metadata is inconsistent. package.json is bumped to 1.4.1, but package-lock.json:3 and package-lock.json:9 still say 1.2.0.

Suggested fix: update the lockfile root version metadata, for example with npm install --package-lock-only, and include it in the PR.

  1. Low: the public config comment for modelsDev.timeoutMs still says the default is 1000ms, but this PR changes MODELS_DEV_TIMEOUT_MS to 5000. See src/types.ts:89.

Suggested fix: update the comment to default: 5000ms.

  1. Low: retry tests use real production sleeps, which adds fixed wall-clock delay and will scale poorly as retry cases grow. See test/models-dev.test.mjs:79, test/models-dev.test.mjs:130, test/models-dev.test.mjs:166, and test/models-dev.test.mjs:245.

Suggested fix: make backoff injectable/configurable for tests and set it to 0, or use a test scheduler/clock.

What I checked:

  • No hardcoded secrets were added.
  • Deleted .opencode files appear to remove local test artifacts, not runtime package files.
  • git diff --check passed.
  • npm test passed: 50/50 tests.
  • npm run prepublishOnly passed.

@kilo-code-bot
Copy link
Copy Markdown
Contributor

kilo-code-bot Bot commented May 19, 2026

Code Review Roast 🔥

Verdict: No Issues Found | Recommendation: Merge

Oh wait, this PR is actually clean. I need to sit down. I had my flamethrower warmed up and everything.

The reliability pipeline rewrite is solid:

  • ✅ Retry logic with exponential backoff is correctly scoped to transient failures (timeout, network, HTTP 429/5xx)
  • ✅ Fail-fast on non-retryable errors (4xx except 429) is the right call
  • ✅ Stale cache fallback preserves functionality when refresh fails
  • ✅ Failure classification with structured logging enables debugging
  • ✅ Tests cover all major scenarios including cache hit, timeout recovery, and fallback paths
  • ✅ Correct DEFAULT_CONTEXT_LIMIT fix from 4096 → 128000 matches actual API

Oh, and removing those accidental .opencode/ config artifacts? Chef's kiss. Nobody wants to accidentally load their local dev config in production.

📊 Overall: Like a first pancake — except this one isn't burnt. It's golden brown and ready to serve.

Files Reviewed (4 files)
  • src/models-dev.ts - Clean reliability pipeline implementation
  • src/constants.ts - Context limit fix and timeout increase
  • test/models-dev.test.mjs - 8 comprehensive test cases
  • docs/superpowers/specs/2026-05-18-models-dev-reliability-design.md - Well-documented design

Reviewed by: kilo-auto/free

@kilo-code-bot
Copy link
Copy Markdown
Contributor

kilo-code-bot Bot commented May 19, 2026

Code Review Roast 🔥

Verdict: 0 Issues Found | Recommendation: Merge Approved

The previous critical issues have been addressed:

Issue Status
Cache not keyed by URL ✅ Fixed - Now uses Map<string, ModelsDevCache>
Blocking retry loop latency ✅ Documented (~15.75s worst-case, per design spec)
Outdated timeout JSDoc ✅ Fixed from 1000ms to 5000ms
Lockfile version mismatch ✅ Resolved (lockfile removed from repo)
Real setTimeout sleeps in tests ✅ Fixed using cacheTtl: 0

🏆 Best part: The retry logic with structured failure classification is genuinely impressive. Categories like timeout, network, http_retryable, etc. show serious engineering discipline.

💀 Worst part: Still waiting for that unicorn where someone uses a Map correctly first time... oh wait, this PR did exactly that. The cache isolation fix is clean.

📊 Overall: Like upgrading from a flip phone to a flagship — the reliability improvements are so thorough I'm almost disappointed I can't find anything to roast.

Files Reviewed (8 files)
  • src/models-dev.ts - Cache isolation + retry pipeline
  • src/types.ts - JSDoc accuracy fix
  • src/constants.ts - Context limit + timeout fixes
  • test/models-dev.test.mjs - 8 new test cases
  • package.json - Version bump
  • CHANGELOG.md - Release notes
  • docs/* - Design specifications

Reviewed by laguna-m.1-20260312:free · 602,774 tokens

Comment thread src/models-dev.ts Outdated
Comment thread src/models-dev.ts Outdated
Comment thread src/models-dev.ts
Comment thread src/models-dev.ts
Comment thread test/models-dev.test.mjs Outdated
Alph4d0g added 2 commits May 19, 2026 16:11
- Cache isolation: key models.dev cache by URL (Map) to prevent cross-config
  data leakage when different configs specify different modelsDev.url values
- Update outdated JSDoc comment for timeoutMs default (1000ms -> 5000ms)
- Eliminate real setTimeout sleeps from tests by using cacheTtl: 0 for
  stale-cache scenarios instead of waiting for TTL expiry
- Document worst-case cold-start latency (~15.75s) in fetchModelsDevData JSDoc
- Update package-lock.json version to 1.4.1 (file is .gitignored but kept
  in sync locally)
- Update CHANGELOG.md and release notes with code review fixes section

All 50 tests pass (0 regressions).
@Alph4d0g Alph4d0g closed this May 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant