refactor(llm): use litellm params for reasoning support #1990
Conversation
Derive reasoning_effort support from get_supported_openai_params and update Nova 2 Lite expectations in tests. Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🟡 Acceptable - Works but introduces risks without adequate safeguards
Core Trade-off: You replaced a simple, explicit list with a runtime dependency on LiteLLM. This is DRY and reduces maintenance, but sacrifices control and introduces failure modes. The old approach was "worse is better" - ugly but bulletproof. The new approach is elegant but brittle.
Verdict: ❌ Needs rework - The broad `except Exception:` is unacceptable, the cache bound is implicit, and error paths are untested.
Key Insight: Dynamic lookups are only better than static lists when the external source is more reliable than you are—prove that LiteLLM is.
| "nova-2-lite", | ||
| ] | ||
|
|
||
| @lru_cache |
🟠 Important: Implicit cache bound - make `maxsize` explicit.
`@lru_cache` with no arguments falls back to the default `maxsize` of 128; only `@lru_cache(maxsize=None)` grows forever. Still, the implicit bound is easy to misread as unbounded and may be sized wrong if users query many unique model names (e.g., dynamic model routing, experimentation, or malicious input). Make the bound explicit.
```diff
-@lru_cache
+@lru_cache(maxsize=256)
```
Why 256? Covers all practical models while preventing DoS. If you need more, prove it with data.
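For context on how a bounded cache behaves, here is a minimal sketch (the `lookup` function is hypothetical, not the PR's code): once `maxsize` is reached, the least-recently-used entry is evicted, so memory stays constant.

```python
from functools import lru_cache

# Tiny bound so eviction is visible; a real cache would use e.g. 256.
@lru_cache(maxsize=2)
def lookup(model: str) -> str:
    # Hypothetical stand-in for an expensive capability lookup.
    return model.upper()

lookup("a")                  # miss
lookup("b")                  # miss
lookup("a")                  # hit - served from cache
lookup("c")                  # miss - evicts "b", the least recently used
info = lookup.cache_info()   # hits=1, misses=3, currsize=2
```

`cache_info()` also gives you the hook for the cache-behavior test suggested elsewhere in this review.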
all-hands-bot
left a comment
Test comment for line 80
```python
    params = get_supported_openai_params(
        model=model_id, custom_llm_provider=custom_provider
    )
except Exception:
```
Test
all-hands-bot
left a comment
Test comment for line 83
```python
except Exception:
    return False

return bool(params and "reasoning_effort" in params)
```
Test
all-hands-bot
left a comment
Test comment for line 4
```python
from dataclasses import dataclass
from functools import lru_cache

from litellm import get_supported_openai_params
```
Test
all-hands-bot
left a comment
Test comment for test file line 35
```diff
 # Anthropic Opus 4.5 (dash variant only)
 ("claude-opus-4-5", True),
-("nova-2-lite", True),
+("nova-2-lite", False),
```
Test
all-hands-bot
left a comment
Additional Critical Issues
Following up on my initial review with remaining inline comments:
```python
    params = get_supported_openai_params(
        model=model_id, custom_llm_provider=custom_provider
    )
except Exception:
```
🔴 Critical: Broad `except Exception:` swallows programming errors, not just API failures.
This catches everything: `AttributeError`, `TypeError`, and other programming bugs. If `get_supported_openai_params` has a bug or you pass wrong types, this silently returns False and masks the real problem.
```diff
-except Exception:
+except (ValueError, KeyError, ImportError, RuntimeError):
+    # LiteLLM API errors - model not found, API failure, etc.
+    return False
```
Fail fast on programming errors. Silent failures are how bugs hide for months.
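A runnable sketch of the narrowed handler (the lookup function here is a stand-in for `litellm.get_supported_openai_params`, and the whole function is illustrative, not the PR's actual code):

```python
from functools import lru_cache

def fake_get_supported_openai_params(model: str) -> list[str]:
    # Stand-in for the LiteLLM call; raises for unknown models the
    # way a real lookup can fail.
    table = {"o1": ["reasoning_effort", "temperature"]}
    if model not in table:
        raise ValueError(f"unknown model: {model}")
    return table[model]

@lru_cache(maxsize=256)
def supports_reasoning_effort(model_id: str) -> bool:
    try:
        params = fake_get_supported_openai_params(model_id)
    except (ValueError, KeyError, ImportError, RuntimeError):
        # Expected lookup failures only. A TypeError or AttributeError
        # (i.e., a programming bug) still propagates and fails fast.
        return False
    return params is not None and "reasoning_effort" in params
```

With this shape, `supports_reasoning_effort("o1")` returns True, an unknown model returns False, and a genuine bug surfaces immediately instead of masquerading as "not supported".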
```python
except Exception:
    return False

return bool(params and "reasoning_effort" in params)
```
🟡 Suggestion: Implicit truthiness check - make the None check explicit.
`params and "reasoning_effort" in params` short-circuits to `params` itself when `params` is falsy (e.g., `None`), which is the only reason the `bool()` wrapper is there. An explicit `None` check expresses the same logic without the coercion.
```diff
-return bool(params and "reasoning_effort" in params)
+return params is not None and "reasoning_effort" in params
```
Clearer: explicitly check params is not None, then check membership. No type coercion theater.
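To see the short-circuit behavior concretely (a standalone illustration, not the PR's code):

```python
params = None

# `and` short-circuits to its falsy left operand, so this is None, not False.
leaky = params and "reasoning_effort" in params

# The explicit check always yields a bool.
explicit = params is not None and "reasoning_effort" in params
```

`leaky` is `None` while `explicit` is `False`, which is why the explicit form needs no `bool()` wrapper.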
```python
from dataclasses import dataclass
from functools import lru_cache

from litellm import get_supported_openai_params
```
🟠 Important: External dependency risk - you now trust LiteLLM to be correct.
The old hardcoded list was under YOUR control. You audited it, tested it, owned it. Now if LiteLLM's get_supported_openai_params is wrong, you're wrong.
Questions:
- What happens when LiteLLM adds a model you don't support yet?
- What happens when they remove a model you still support?
- What happens when their API changes?
This is a pragmatic trade-off (less maintenance) but you've outsourced correctness. Document this risk and have a fallback plan.
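One possible fallback shape, sketched under the assumption that you keep a small audited override table alongside the dynamic lookup (all names below are illustrative):

```python
# Static overrides win over the dynamic lookup, so you keep final say
# for models you have audited, while LiteLLM covers the long tail.
REASONING_EFFORT_OVERRIDES: dict[str, bool] = {
    "nova-2-lite": False,  # hypothetical audited entry
}

def supports_reasoning_effort(model_id: str, dynamic_lookup) -> bool:
    if model_id in REASONING_EFFORT_OVERRIDES:
        return REASONING_EFFORT_OVERRIDES[model_id]
    try:
        params = dynamic_lookup(model_id)
    except (ValueError, KeyError):
        # Unknown to the dynamic source - treat as unsupported.
        return False
    return params is not None and "reasoning_effort" in params

# Even if the dynamic source claims support, the override wins.
overridden = supports_reasoning_effort("nova-2-lite", lambda m: ["reasoning_effort"])
dynamic = supports_reasoning_effort("o1", lambda m: ["reasoning_effort"])
```

This keeps the maintenance win of the dynamic lookup while documenting exactly where you disagree with the external source.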
```diff
 # Anthropic Opus 4.5 (dash variant only)
 ("claude-opus-4-5", True),
-("nova-2-lite", True),
+("nova-2-lite", False),
```
🟠 Important: Breaking behavior change without migration notes.
nova-2-lite went from True → False. Was the old code lying (bug fix), or is this a breaking change? If users rely on this, their code breaks.
Document this:
- Add a comment explaining WHY it changed
- Consider deprecation warning if this breaks users
- Or prove that nova-2-lite never actually supported reasoning_effort (in which case, this is a bug fix)
"Never break userspace" - if this changes user-visible behavior, explain why it's justified.
all-hands-bot
left a comment
🟠 Testing Gaps - Critical Missing Coverage
You added complex logic (prefix stripping, provider splitting, exception handling, LRU cache) but only test the happy path through get_features().
Missing test coverage for `_supports_reasoning_effort()`:

- Error handling: What happens when `get_supported_openai_params` raises an exception? Your broad `except Exception:` hides this.
- LRU cache behavior: Does it actually work? Test: call the same model twice and verify only one LiteLLM call is made.
- Edge cases:
  - Empty string: `_supports_reasoning_effort("")`
  - None: `_supports_reasoning_effort(None)`
  - Whitespace-only: `_supports_reasoning_effort(" ")`
  - Malformed provider: `_supports_reasoning_effort("a/b/c/d")`
- Prefix handling: Does the `litellm_proxy/` prefix stripping work correctly?
  - `_supports_reasoning_effort("litellm_proxy/o1")`
  - `_supports_reasoning_effort("litellm_proxy/openai/o1")`
- Custom provider splitting: Does it handle multiple slashes correctly?
Tests should prove the code works, not just assert it was called. Add unit tests for _supports_reasoning_effort directly (yes, it's private, but it has complex logic that deserves dedicated tests).
Bottom line: Without these tests, you're flying blind. When LiteLLM changes or edge cases hit production, you'll have no safety net.
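A sketch of the cache-behavior test suggested above, using a counting stub in place of LiteLLM (every name here is illustrative, not the PR's actual code):

```python
from functools import lru_cache

calls = {"n": 0}

def stub_lookup(model_id: str) -> list[str]:
    # Counts invocations so the test can prove the cache short-circuits.
    calls["n"] += 1
    return ["reasoning_effort"]

@lru_cache(maxsize=256)
def supports_reasoning_effort(model_id: str) -> bool:
    try:
        params = stub_lookup(model_id)
    except (ValueError, KeyError):
        return False
    return params is not None and "reasoning_effort" in params

first = supports_reasoning_effort("o1")   # triggers one stub call
second = supports_reasoning_effort("o1")  # served from the cache
```

After both calls, `calls["n"]` is 1: the second lookup never reached the stub, which is exactly what a cache test should assert.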
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
Looks like there are a few issues preventing this PR from being merged!
If you'd like me to help, just leave a comment. Feel free to include any additional details that might help me get this PR into a better state.