
Conversation

@enyst (Collaborator) commented on Feb 10, 2026

Summary

  • derive reasoning_effort support from get_supported_openai_params and remove hardcoded list
  • normalize litellm_proxy-prefixed models before checking supported params
  • update Nova 2 Lite reasoning expectations in tests

Testing

  • uv run pre-commit run --files openhands-sdk/openhands/sdk/llm/utils/model_features.py
  • uv run pre-commit run --files tests/sdk/llm/test_model_features.py

Derive reasoning_effort support from get_supported_openai_params and
update Nova 2 Lite expectations in tests.

Co-authored-by: openhands <openhands@all-hands.dev>
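
For reference, a minimal sketch of the derived check, reconstructed from the fragments quoted in the review comments below; the exact prefix normalization and provider splitting are assumptions rather than the PR's verbatim code:

from functools import lru_cache

from litellm import get_supported_openai_params


@lru_cache
def _supports_reasoning_effort(model: str) -> bool:
    """Ask LiteLLM whether the model accepts the reasoning_effort param."""
    # Normalize litellm_proxy-prefixed models, e.g. "litellm_proxy/openai/o1".
    model_id = model.removeprefix("litellm_proxy/")

    # Split an optional "provider/model" form into its parts (assumed shape).
    custom_provider = None
    if "/" in model_id:
        custom_provider, model_id = model_id.split("/", 1)

    try:
        params = get_supported_openai_params(
            model=model_id, custom_llm_provider=custom_provider
        )
    except Exception:
        return False

    return bool(params and "reasoning_effort" in params)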

@all-hands-bot (Collaborator) left a comment

🟡 Acceptable - Works but introduces risks without adequate safeguards

Core Trade-off: You replaced a simple, explicit list with a runtime dependency on LiteLLM. This is DRY and reduces maintenance, but sacrifices control and introduces failure modes. The old approach was "worse is better" - ugly but bulletproof. The new approach is elegant but brittle.

Verdict: Needs rework - the broad except Exception: is unacceptable, the cache is unbounded, and the error paths are untested.

Key Insight: Dynamic lookups are only better than static lists when the external source is more reliable than you are—prove that LiteLLM is.

"nova-2-lite",
]

@lru_cache

🟠 Important: Unbounded cache is a memory leak waiting to happen.

@lru_cache without maxsize means this cache grows forever. If users query many unique model names (e.g., dynamic model routing, experimentation, or malicious input), this will consume unbounded memory.

Suggested change
@lru_cache
@lru_cache(maxsize=256)

Why 256? Covers all practical models while preventing DoS. If you need more, prove it with data.
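
For illustration, a tiny self-contained demo of the bound (the lookup here is a stand-in, not the real LiteLLM call):

from functools import lru_cache


@lru_cache(maxsize=256)
def lookup(model: str) -> bool:
    # Stand-in for the real LiteLLM-backed check.
    return model.startswith("o1")


for i in range(1000):
    lookup(f"model-{i}")

# Only the 256 most recently used entries are retained; older ones are evicted.
print(lookup.cache_info())  # CacheInfo(hits=0, misses=1000, maxsize=256, currsize=256)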

@all-hands-bot (Collaborator) left a comment

Test comment for line 80

params = get_supported_openai_params(
model=model_id, custom_llm_provider=custom_provider
)
except Exception:

Test

@all-hands-bot (Collaborator) left a comment

Test comment for line 83

except Exception:
return False

return bool(params and "reasoning_effort" in params)

Test

@all-hands-bot (Collaborator) left a comment

Test comment for line 4

from dataclasses import dataclass
from functools import lru_cache

from litellm import get_supported_openai_params

Test

@all-hands-bot (Collaborator) left a comment

Test comment for test file line 35

# Anthropic Opus 4.5 (dash variant only)
("claude-opus-4-5", True),
("nova-2-lite", True),
("nova-2-lite", False),

Test

@all-hands-bot (Collaborator) left a comment

Additional Critical Issues

Following up on my initial review with remaining inline comments:

params = get_supported_openai_params(
model=model_id, custom_llm_provider=custom_provider
)
except Exception:

🔴 Critical: A broad except Exception: swallows programming errors, not just API failures.

This catches EVERYTHING: AttributeError, TypeError, and other programming bugs. If get_supported_openai_params has a bug or you pass wrong types, this silently returns False and masks the real problem.

Suggested change
except Exception:
except (ValueError, KeyError, ImportError, RuntimeError):
# LiteLLM API errors - model not found, API failure, etc.
return False

Fail fast on programming errors. Silent failures are how bugs hide for months.
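
A fuller version of that shape, with a log breadcrumb so failures stay visible; the helper name _lookup_params is hypothetical, and whether these four exception classes actually cover LiteLLM's failure modes is an assumption that needs checking:

import logging

from litellm import get_supported_openai_params

logger = logging.getLogger(__name__)


def _lookup_params(model_id: str, custom_provider: str | None) -> list | None:
    try:
        return get_supported_openai_params(
            model=model_id, custom_llm_provider=custom_provider
        )
    except (ValueError, KeyError, ImportError, RuntimeError) as exc:
        # Expected LiteLLM lookup failures (unknown model, provider errors):
        # log a breadcrumb and treat the feature as unsupported.
        logger.debug("reasoning_effort lookup failed for %s: %s", model_id, exc)
        return None
    # Anything else (TypeError, AttributeError, ...) propagates so bugs surface.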

except Exception:
return False

return bool(params and "reasoning_effort" in params)

🟡 Suggestion: Replace the bool() coercion with an explicit None check.

params and "reasoning_effort" in params short-circuits to params itself when params is falsy (None or an empty list), so the bool() wrapper only exists to coerce that value back to a bool. An explicit check states the intent directly.

Suggested change
return bool(params and "reasoning_effort" in params)
return params is not None and "reasoning_effort" in params

Clearer: explicitly check params is not None, then check membership. No type coercion theater.
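
A quick REPL illustration of why the coercion was there and what the explicit check returns:

>>> params = None
>>> params and "reasoning_effort" in params        # short-circuits to None, nothing echoed
>>> bool(params and "reasoning_effort" in params)
False
>>> params is not None and "reasoning_effort" in params
False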

from dataclasses import dataclass
from functools import lru_cache

from litellm import get_supported_openai_params

🟠 Important: External dependency risk - you now trust LiteLLM to be correct.

The old hardcoded list was under YOUR control. You audited it, tested it, owned it. Now if LiteLLM's get_supported_openai_params is wrong, you're wrong.

Questions:

  1. What happens when LiteLLM adds a model you don't support yet?
  2. What happens when they remove a model you still support?
  3. What happens when their API changes?

This is a pragmatic trade-off (less maintenance) but you've outsourced correctness. Document this risk and have a fallback plan.
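
One possible fallback shape - illustrative only; the allowlist contents and the function name are placeholders, not a claim about which models actually support the flag:

_KNOWN_REASONING_EFFORT_MODELS = {"o1", "o3-mini"}  # placeholder, audited in-repo


def supports_reasoning_effort(model_id: str) -> bool:
    # Trust the audited allowlist first; only then consult LiteLLM.
    if model_id in _KNOWN_REASONING_EFFORT_MODELS:
        return True
    try:
        from litellm import get_supported_openai_params

        params = get_supported_openai_params(model=model_id)
    except Exception:
        return False
    return bool(params and "reasoning_effort" in params)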

# Anthropic Opus 4.5 (dash variant only)
("claude-opus-4-5", True),
("nova-2-lite", True),
("nova-2-lite", False),

🟠 Important: Breaking behavior change without migration notes.

nova-2-lite went from True → False. Was the old code lying (bug fix), or is this a breaking change? If users rely on this, their code breaks.

Document this:

  • Add a comment explaining WHY it changed
  • Consider deprecation warning if this breaks users
  • Or prove that nova-2-lite never actually supported reasoning_effort (in which case, this is a bug fix)

"Never break userspace" - if this changes user-visible behavior, explain why it's justified.

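If users do turn out to depend on the old True, a transitional warning is one option. Everything below is hypothetical (names, message, and the stand-in check), not the PR's code:

import warnings


def _supports_reasoning_effort(model_id: str) -> bool:
    # Stand-in for the dynamic LiteLLM-backed check sketched earlier.
    return False


def supports_reasoning_effort(model_id: str) -> bool:
    supported = _supports_reasoning_effort(model_id)
    if model_id == "nova-2-lite" and not supported:
        # Transitional notice for callers who relied on the old hardcoded True.
        warnings.warn(
            "nova-2-lite no longer reports reasoning_effort support; the flag "
            "is now derived from LiteLLM's supported params.",
            DeprecationWarning,
            stacklevel=2,
        )
    return supported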

@all-hands-bot (Collaborator) left a comment

🟠 Testing Gaps - Critical Missing Coverage

You added complex logic (prefix stripping, provider splitting, exception handling, LRU cache) but only test the happy path through get_features().

Missing test coverage for _supports_reasoning_effort():

  1. Error handling: What happens when get_supported_openai_params raises an exception? Your broad except Exception: hides this.

  2. LRU cache behavior: Does it actually work? Test: Call same model twice, verify only one LiteLLM call made.

  3. Edge cases:

    • Empty string: _supports_reasoning_effort("")
    • None: _supports_reasoning_effort(None)
    • Whitespace-only: _supports_reasoning_effort(" ")
    • Malformed provider: _supports_reasoning_effort("a/b/c/d")

  4. Prefix handling: Does litellm_proxy/ prefix stripping work correctly?

    • _supports_reasoning_effort("litellm_proxy/o1")
    • _supports_reasoning_effort("litellm_proxy/openai/o1")

  5. Custom provider splitting: Does it handle multiple slashes correctly?

Tests should prove the code works, not just assert it was called. Add unit tests for _supports_reasoning_effort directly (yes, it's private, but it has complex logic that deserves dedicated tests).

Bottom line: Without these tests, you're flying blind. When LiteLLM changes or edge cases hit production, you'll have no safety net.
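
A sketch of that missing coverage; the module path (inferred from the PR's Testing notes), the private helper name, and the assumption that get_supported_openai_params is imported into that module's namespace all need confirming against the actual diff:

from openhands.sdk.llm.utils import model_features as mf


def test_lookup_failure_returns_false(monkeypatch):
    def boom(**kwargs):
        raise ValueError("unknown model")

    monkeypatch.setattr(mf, "get_supported_openai_params", boom)
    mf._supports_reasoning_effort.cache_clear()  # drop any cached results
    assert mf._supports_reasoning_effort("definitely-not-a-model") is False


def test_result_is_cached(monkeypatch):
    calls = []

    def fake(**kwargs):
        calls.append(kwargs)
        return ["reasoning_effort"]

    monkeypatch.setattr(mf, "get_supported_openai_params", fake)
    mf._supports_reasoning_effort.cache_clear()
    assert mf._supports_reasoning_effort("o1") is True
    assert mf._supports_reasoning_effort("o1") is True
    assert len(calls) == 1  # second call is served from the LRU cache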

Co-authored-by: OpenHands Bot <contact@all-hands.dev>

openhands-ai bot commented Feb 10, 2026

Looks like there are a few issues preventing this PR from being merged!

  • GitHub Actions are failing:
    • Run tests
    • Agent Server
    • Review Thread Gate

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #1990 at branch `cleanup-reasoning-models`

Feel free to include any additional details that might help me get this PR into a better state.
