Skip to content

feat: make LLM call timeout configurable via env var#58

Merged
AVIDS2 merged 9 commits intoAVIDS2:mainfrom
RaviTharuma:feat/configurable-llm-timeouts
Apr 9, 2026
Merged

feat: make LLM call timeout configurable via env var#58
AVIDS2 merged 9 commits intoAVIDS2:mainfrom
RaviTharuma:feat/configurable-llm-timeouts

Conversation

@RaviTharuma
Copy link
Copy Markdown
Contributor

Summary

  • Add MEMORIX_LLM_TIMEOUT_MS environment variable to control LLM call timeouts (default: 30000ms / 30s)
  • Apply AbortSignal.timeout() to both OpenAI-compatible and Anthropic fetch calls in src/llm/provider.ts
  • Previous behavior: hardcoded 12s timeout injected at build time, not configurable

Problem

When using LLM providers behind a proxy (e.g., CLIProxyAPI/Quotio routing to Chutes/Fireworks), the routing + model cold start regularly exceeds 12 seconds. This causes:

  • Endless LLM compact failed, falling back to heuristic log spam
  • Process instability — timeout cascades can crash the background control plane
  • LLM enhanced mode effectively disabled (every call falls back to heuristic)

See #57 for full details.

Usage

# Set custom timeout (e.g., 60 seconds for slow proxy chains)
export MEMORIX_LLM_TIMEOUT_MS=60000

Closes #57

…nv var

Add AbortSignal.timeout to both OpenAI-compatible and Anthropic fetch
calls in provider.ts, controlled by the MEMORIX_LLM_TIMEOUT_MS
environment variable (default: 30000ms).

This fixes timeouts when using LLM providers behind proxies
(e.g. CLIProxyAPI routing to Chutes/Fireworks) where routing + model
cold start regularly exceeds the previous hardcoded 12s timeout.

Closes AVIDS2#57
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e2d3fb8ecf

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

* Configurable via MEMORIX_LLM_TIMEOUT_MS environment variable.
* Default: 30000ms (30s) — allows for proxy routing and cold starts.
*/
const LLM_CALL_TIMEOUT_MS = parseInt(process.env.MEMORIX_LLM_TIMEOUT_MS || '30000', 10);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Validate timeout env var before passing to AbortSignal

LLM_CALL_TIMEOUT_MS is parsed once with parseInt(...) and used directly in AbortSignal.timeout(...) without validation. If MEMORIX_LLM_TIMEOUT_MS is set to a non-integer or out-of-range value (for example 60s, empty string, -1, or a very large number), Node throws a RangeError, causing every LLM request to fail before the HTTP call is made. This turns a configuration typo into a full LLM-path outage instead of safely falling back to the default timeout.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Owner

@AVIDS2 AVIDS2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this. The direction is useful, but this PR still needs one important fix before we can take it: MEMORIX_LLM_TIMEOUT_MS is parsed and passed directly into AbortSignal.timeout(...) without validation, so values like 60s, -1, or other invalid input can turn a config typo into a hard LLM-path failure. Please validate and clamp/fallback before use. Also, this PR overlaps heavily with #59, so once the validation issue is addressed we’ll likely consolidate toward a single timeout PR instead of merging both.

Copy link
Copy Markdown
Contributor Author

@RaviTharuma RaviTharuma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review! Added input validation for MEMORIX_LLM_TIMEOUT_MS:

  • Extracted parseLLMTimeoutMs() helper that validates the env var is a valid integer
  • Returns default (30s) for undefined, empty, non-numeric, float, or NaN values
  • Clamps valid integers to [1000, 300000] ms range (1s–5min)
  • Added 9 unit tests covering all edge cases

Regarding overlap with #59 — happy to consolidate. Our implementation now has the validation that was requested, so whichever PR you prefer to merge is fine.

MEMORIX_LLM_TIMEOUT_MS is now validated: must be a plain positive integer
in the range 1000–300000ms. Values like "60s", "-1", or NaN fall back to
the default 30000ms with a console.warn explaining the issue.
@RaviTharuma
Copy link
Copy Markdown
Contributor Author

Fixed — added validation: timeout must be a positive integer between 1000-300000ms, falls back to default on invalid input.

@RaviTharuma RaviTharuma requested a review from AVIDS2 April 8, 2026 19:09
TYPE_WEIGHTS map requires all ObservationType keys — add 'reasoning' entry.
archiveExpired document builder requires 'source' field — default to 'agent'.

Fixes remaining TS2741 errors surfaced by tsc --noEmit.
- Add 'reasoning' to TYPE_WEIGHTS in evaluate.ts (ObservationType exhaustiveness)
- Add source field to MemorixDocument object literal in retention.ts
- Change parseLLMTimeoutMs to clamp out-of-range values to min/max
  instead of falling back to default, matching test expectations
@RaviTharuma RaviTharuma closed this Apr 8, 2026
@RaviTharuma RaviTharuma reopened this Apr 8, 2026
@RaviTharuma
Copy link
Copy Markdown
Contributor Author

@AVIDS2 please check now :)

Copy link
Copy Markdown
Owner

@AVIDS2 AVIDS2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good now. The timeout env handling no longer turns invalid input into an AbortSignal.timeout(...) failure path, the validation/clamping behavior is explicit, and the added tests make the edge cases clear. Between the two overlapping timeout PRs, this is the stronger one to carry forward.

@AVIDS2 AVIDS2 merged commit 093980b into AVIDS2:main Apr 9, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

LLM_CALL_TIMEOUT_MS is hardcoded at 12s — causes timeouts with proxy-routed models

2 participants