refactor(llm): use litellm params for reasoning support #1990
Conversation
Derive reasoning_effort support from get_supported_openai_params and update Nova 2 Lite expectations in tests. Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🟡 Acceptable - Works but introduces risks without adequate safeguards
Core Trade-off: You replaced a simple, explicit list with a runtime dependency on LiteLLM. This is DRY and reduces maintenance, but sacrifices control and introduces failure modes. The old approach was "worse is better" - ugly but bulletproof. The new approach is elegant but brittle.
Verdict: ❌ Needs rework - The broad `except Exception:` is unacceptable, the cache bound is implicit, and error paths are untested.
Key Insight: Dynamic lookups are only better than static lists when the external source is more reliable than you are—prove that LiteLLM is.
| "nova-2-lite", | ||
| ] | ||
|
|
||
| @lru_cache |
🟠 Important: Implicit cache bound - make `maxsize` explicit.
`@lru_cache` with no arguments falls back to the default `maxsize` of 128; only `@lru_cache(maxsize=None)` grows forever. Still, the implicit bound is easy to misread as unbounded and may be sized wrong if users query many unique model names (e.g., dynamic model routing, experimentation, or malicious input). Make the bound explicit.
```diff
-@lru_cache
+@lru_cache(maxsize=256)
```
Why 256? Covers all practical models while preventing DoS. If you need more, prove it with data.
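For context on how a bounded cache behaves, here is a minimal sketch (the `lookup` function is hypothetical, not the PR's code): once `maxsize` is reached, the least-recently-used entry is evicted, so memory stays constant.

```python
from functools import lru_cache

# Tiny bound so eviction is visible; a real cache would use e.g. 256.
@lru_cache(maxsize=2)
def lookup(model: str) -> str:
    # Hypothetical stand-in for an expensive capability lookup.
    return model.upper()

lookup("a")                  # miss
lookup("b")                  # miss
lookup("a")                  # hit - served from cache
lookup("c")                  # miss - evicts "b", the least recently used
info = lookup.cache_info()   # hits=1, misses=3, currsize=2
```

`cache_info()` also gives you the hook for the cache-behavior test suggested elsewhere in this review.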
all-hands-bot
left a comment
Test comment for line 80
```python
    params = get_supported_openai_params(
        model=model_id, custom_llm_provider=custom_provider
    )
except Exception:
```
Test
all-hands-bot
left a comment
Test comment for line 83
```python
except Exception:
    return False

return bool(params and "reasoning_effort" in params)
```
Test
all-hands-bot
left a comment
Test comment for line 4
```python
from dataclasses import dataclass
from functools import lru_cache

from litellm import get_supported_openai_params
```
Test
all-hands-bot
left a comment
Test comment for test file line 35
```diff
 # Anthropic Opus 4.5 (dash variant only)
 ("claude-opus-4-5", True),
-("nova-2-lite", True),
+("nova-2-lite", False),
```
Test
all-hands-bot
left a comment
Additional Critical Issues
Following up on my initial review with remaining inline comments:
```python
    params = get_supported_openai_params(
        model=model_id, custom_llm_provider=custom_provider
    )
except Exception:
```
🔴 Critical: Broad `except Exception:` swallows programming errors, not just API failures.
This catches everything: `AttributeError`, `TypeError`, and other programming bugs. If `get_supported_openai_params` has a bug or you pass wrong types, this silently returns False and masks the real problem.
```diff
-except Exception:
+except (ValueError, KeyError, ImportError, RuntimeError):
+    # LiteLLM API errors - model not found, API failure, etc.
+    return False
```
Fail fast on programming errors. Silent failures are how bugs hide for months.
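A runnable sketch of the narrowed handler (the lookup function here is a stand-in for `litellm.get_supported_openai_params`, and the whole function is illustrative, not the PR's actual code):

```python
from functools import lru_cache

def fake_get_supported_openai_params(model: str) -> list[str]:
    # Stand-in for the LiteLLM call; raises for unknown models the
    # way a real lookup can fail.
    table = {"o1": ["reasoning_effort", "temperature"]}
    if model not in table:
        raise ValueError(f"unknown model: {model}")
    return table[model]

@lru_cache(maxsize=256)
def supports_reasoning_effort(model_id: str) -> bool:
    try:
        params = fake_get_supported_openai_params(model_id)
    except (ValueError, KeyError, ImportError, RuntimeError):
        # Expected lookup failures only. A TypeError or AttributeError
        # (i.e., a programming bug) still propagates and fails fast.
        return False
    return params is not None and "reasoning_effort" in params
```

With this shape, `supports_reasoning_effort("o1")` returns True, an unknown model returns False, and a genuine bug surfaces immediately instead of masquerading as "not supported".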
```python
except Exception:
    return False

return bool(params and "reasoning_effort" in params)
```
🟡 Suggestion: Implicit truthiness check - make the None check explicit.
`params and "reasoning_effort" in params` short-circuits to `params` itself when `params` is falsy (e.g., `None`), which is the only reason the `bool()` wrapper is there. An explicit `None` check expresses the same logic without the coercion.
```diff
-return bool(params and "reasoning_effort" in params)
+return params is not None and "reasoning_effort" in params
```
Clearer: explicitly check params is not None, then check membership. No type coercion theater.
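To see the short-circuit behavior concretely (a standalone illustration, not the PR's code):

```python
params = None

# `and` short-circuits to its falsy left operand, so this is None, not False.
leaky = params and "reasoning_effort" in params

# The explicit check always yields a bool.
explicit = params is not None and "reasoning_effort" in params
```

`leaky` is `None` while `explicit` is `False`, which is why the explicit form needs no `bool()` wrapper.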
```python
from dataclasses import dataclass
from functools import lru_cache

from litellm import get_supported_openai_params
```
🟠 Important: External dependency risk - you now trust LiteLLM to be correct.
The old hardcoded list was under YOUR control. You audited it, tested it, owned it. Now if LiteLLM's get_supported_openai_params is wrong, you're wrong.
Questions:
- What happens when LiteLLM adds a model you don't support yet?
- What happens when they remove a model you still support?
- What happens when their API changes?
This is a pragmatic trade-off (less maintenance) but you've outsourced correctness. Document this risk and have a fallback plan.
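One possible fallback shape, sketched under the assumption that you keep a small audited override table alongside the dynamic lookup (all names below are illustrative):

```python
# Static overrides win over the dynamic lookup, so you keep final say
# for models you have audited, while LiteLLM covers the long tail.
REASONING_EFFORT_OVERRIDES: dict[str, bool] = {
    "nova-2-lite": False,  # hypothetical audited entry
}

def supports_reasoning_effort(model_id: str, dynamic_lookup) -> bool:
    if model_id in REASONING_EFFORT_OVERRIDES:
        return REASONING_EFFORT_OVERRIDES[model_id]
    try:
        params = dynamic_lookup(model_id)
    except (ValueError, KeyError):
        # Unknown to the dynamic source - treat as unsupported.
        return False
    return params is not None and "reasoning_effort" in params

# Even if the dynamic source claims support, the override wins.
overridden = supports_reasoning_effort("nova-2-lite", lambda m: ["reasoning_effort"])
dynamic = supports_reasoning_effort("o1", lambda m: ["reasoning_effort"])
```

This keeps the maintenance win of the dynamic lookup while documenting exactly where you disagree with the external source.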
```diff
 # Anthropic Opus 4.5 (dash variant only)
 ("claude-opus-4-5", True),
-("nova-2-lite", True),
+("nova-2-lite", False),
```
🟠 Important: Breaking behavior change without migration notes.
nova-2-lite went from True → False. Was the old code lying (bug fix), or is this a breaking change? If users rely on this, their code breaks.
Document this:
- Add a comment explaining WHY it changed
- Consider deprecation warning if this breaks users
- Or prove that nova-2-lite never actually supported reasoning_effort (in which case, this is a bug fix)
"Never break userspace" - if this changes user-visible behavior, explain why it's justified.
all-hands-bot
left a comment
🟠 Testing Gaps - Critical Missing Coverage
You added complex logic (prefix stripping, provider splitting, exception handling, LRU cache) but only test the happy path through get_features().
Missing test coverage for `_supports_reasoning_effort()`:

- Error handling: What happens when `get_supported_openai_params` raises an exception? Your broad `except Exception:` hides this.
- LRU cache behavior: Does it actually work? Test: call the same model twice and verify only one LiteLLM call is made.
- Edge cases:
  - Empty string: `_supports_reasoning_effort("")`
  - None: `_supports_reasoning_effort(None)`
  - Whitespace-only: `_supports_reasoning_effort(" ")`
  - Malformed provider: `_supports_reasoning_effort("a/b/c/d")`
- Prefix handling: Does the `litellm_proxy/` prefix stripping work correctly?
  - `_supports_reasoning_effort("litellm_proxy/o1")`
  - `_supports_reasoning_effort("litellm_proxy/openai/o1")`
- Custom provider splitting: Does it handle multiple slashes correctly?
Tests should prove the code works, not just assert it was called. Add unit tests for _supports_reasoning_effort directly (yes, it's private, but it has complex logic that deserves dedicated tests).
Bottom line: Without these tests, you're flying blind. When LiteLLM changes or edge cases hit production, you'll have no safety net.
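A sketch of the cache-behavior test suggested above, using a counting stub in place of LiteLLM (every name here is illustrative, not the PR's actual code):

```python
from functools import lru_cache

calls = {"n": 0}

def stub_lookup(model_id: str) -> list[str]:
    # Counts invocations so the test can prove the cache short-circuits.
    calls["n"] += 1
    return ["reasoning_effort"]

@lru_cache(maxsize=256)
def supports_reasoning_effort(model_id: str) -> bool:
    try:
        params = stub_lookup(model_id)
    except (ValueError, KeyError):
        return False
    return params is not None and "reasoning_effort" in params

first = supports_reasoning_effort("o1")   # triggers one stub call
second = supports_reasoning_effort("o1")  # served from the cache
```

After both calls, `calls["n"]` is 1: the second lookup never reached the stub, which is exactly what a cache test should assert.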
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
Looks like there are a few issues preventing this PR from being merged!
If you'd like me to help, just leave a comment. Feel free to include any additional details that might help me get this PR into a better state.