feat(protocol): add thinking param to Chat API and support bare string image_url#1098
feat(protocol): add thinking param to Chat API and support bare string image_url#1098ConnorLi96 wants to merge 1 commit intomainfrom
Conversation
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 16 minutes and 50 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (3)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
…g image_url
1. Add Anthropic-style `thinking` field to ChatCompletionRequest.
Normalized to chat_template_kwargs with both `thinking` and
`enable_thinking` keys for Kimi-K2.5 and Qwen3 template compat.
2. Support bare string image_url: accept both `"image_url": "url"`
and `"image_url": {"url": "..."}` via custom serde Deserialize.
3. Reduce tiktoken partial UTF-8 decode log from warn to debug
(expected during streaming of CJK/emoji, not an error).
Includes unit tests for ImageUrl deserialization (bare string, object,
missing url, roundtrip) and thinking normalization.
Signed-off-by: ConnorLi96 <ConnorLi96@users.noreply.github.com>
Made-with: Cursor
| ThinkingConfig::Enabled { .. } => { | ||
| kwargs | ||
| .entry("enable_thinking".to_string()) | ||
| .or_insert(Value::Bool(true)); | ||
| kwargs | ||
| .entry("thinking".to_string()) | ||
| .or_insert(Value::Bool(true)); | ||
| } |
There was a problem hiding this comment.
🟡 Nit: budget_tokens is silently discarded here — the Enabled { .. } pattern ignores it, and only a boolean true is forwarded to the template kwargs. Since budget_tokens is a required field on ThinkingConfig::Enabled, every caller must provide it, yet it has no effect on this code path. This could confuse users who expect their token budget to be respected.
Consider either:
- Adding a
tracing::debug!noting the value is ignored (cheap, aids debugging), or - Forwarding it as an additional kwarg (e.g.,
budget_tokens) so future templates can opt in to it.
There was a problem hiding this comment.
Code Review
This pull request introduces support for extended thinking configurations in ChatCompletionRequest by mapping the thinking field to chat_template_kwargs. It also enhances ImageUrl to support deserialization from both string and object formats and downgrades the logging level for tiktoken decode failures. Feedback was provided to ensure the thinking field takes precedence during normalization and to update the JsonSchema for ImageUrl to match its new deserialization capabilities.
| if let Some(ref thinking) = self.thinking { | ||
| let kwargs = self.chat_template_kwargs.get_or_insert_with(HashMap::new); | ||
| match thinking { | ||
| ThinkingConfig::Enabled { .. } => { | ||
| kwargs | ||
| .entry("enable_thinking".to_string()) | ||
| .or_insert(Value::Bool(true)); | ||
| kwargs | ||
| .entry("thinking".to_string()) | ||
| .or_insert(Value::Bool(true)); | ||
| } | ||
| ThinkingConfig::Disabled => { | ||
| kwargs | ||
| .entry("enable_thinking".to_string()) | ||
| .or_insert(Value::Bool(false)); | ||
| kwargs | ||
| .entry("thinking".to_string()) | ||
| .or_insert(Value::Bool(false)); | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Using or_insert here creates a logic issue where the explicit thinking field is ignored if the corresponding keys already exist in chat_template_kwargs. For example, if a user sets thinking: Disabled but chat_template_kwargs contains "enable_thinking": true, the model will remain in thinking mode. Since thinking is a first-class field in the request, it should take precedence over the generic chat_template_kwargs during normalization to ensure alignment with the external specification. Additionally, the budget_tokens value from the Enabled variant is currently dropped; consider propagating it to the kwargs as well if the underlying templates support it.
if let Some(ref thinking) = self.thinking {
let kwargs = self.chat_template_kwargs.get_or_insert_with(HashMap::new);
match thinking {
ThinkingConfig::Enabled { .. } => {
kwargs.insert("enable_thinking".to_string(), Value::Bool(true));
kwargs.insert("thinking".to_string(), Value::Bool(true));
}
ThinkingConfig::Disabled => {
kwargs.insert("enable_thinking".to_string(), Value::Bool(false));
kwargs.insert("thinking".to_string(), Value::Bool(false));
}
}
}References
- For protocol data structures that mirror an external API (e.g., OpenAI), prioritize alignment with the external specification over internal consistency.
| #[derive(Debug, Clone, Serialize, PartialEq, schemars::JsonSchema)] | ||
| pub struct ImageUrl { | ||
| pub url: String, | ||
| #[serde(skip_serializing_if = "Option::is_none")] | ||
| pub detail: Option<String>, // "auto", "low", or "high" | ||
| pub detail: Option<String>, | ||
| } |
There was a problem hiding this comment.
The ImageUrl struct now supports deserialization from a bare string via a custom Deserialize implementation, but the derived JsonSchema does not reflect this change. This will result in generated API documentation and client libraries that only allow the object form. To maintain alignment with external specifications that allow both forms, you should implement schemars::JsonSchema manually for ImageUrl to specify that it can be either a string or an object (e.g., using any_of).
References
- For protocol data structures that mirror an external API (e.g., OpenAI), prioritize alignment with the external specification over internal consistency.
| } | ||
|
|
||
| #[derive(Debug, Clone, Deserialize, Serialize, PartialEq, schemars::JsonSchema)] | ||
| #[derive(Debug, Clone, Serialize, PartialEq, schemars::JsonSchema)] |
There was a problem hiding this comment.
🟡 Nit: The schemars::JsonSchema derive still generates a schema for the struct form only (object with url + optional detail). Now that the custom Deserialize impl also accepts a bare string, the generated JSON Schema / OpenAPI docs won't reflect that. Clients doing client-side validation against the schema would reject the bare-string form that the server actually accepts.
If the schema is exposed in API docs, consider a manual JsonSchema impl (or #[schemars(with = ...)]) that emits a oneOf of string and object.
There was a problem hiding this comment.
Clean PR with good test coverage. Two minor nits posted inline — neither is blocking.
Summary: 🟡 Nit x2, 🔴 Important x0
budget_tokenssilently discarded during thinking normalization — worth a debug log or forwarding as a kwargJsonSchemaderive onImageUrldoesn't reflect the new bare-string acceptance
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cf003e3994
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // Migrate thinking → chat_template_kwargs. | ||
| // Sets both keys: Qwen3 templates check `enable_thinking`, Kimi-K2.5 checks `thinking`. | ||
| // Uses or_insert to not override user-provided kwargs. | ||
| if let Some(ref thinking) = self.thinking { |
There was a problem hiding this comment.
Drop migrated thinking field after populating template kwargs
In the OpenAI HTTP routing path I checked (model_gateway/src/routers/openai/chat.rs:84-106), the request is serialized from ChatCompletionRequest and then only fields in SGLANG_FIELDS are stripped (model_gateway/src/routers/openai/provider.rs:11-35), which does not include thinking. Because this normalization block migrates thinking into chat_template_kwargs but leaves self.thinking set, any client that now sends thinking will forward an extra unsupported parameter to external OpenAI-compatible backends (a regression from previous behavior where unknown thinking was dropped during deserialization), causing 400-style upstream rejections.
Useful? React with 👍 / 👎.
Problem
thinkingfield)."image_url": "https://..."(bare string) instead of"image_url": {"url": "https://..."}, causing deserialization errors.Solution
thinking: Option<ThinkingConfig>toChatCompletionRequest, reusing the existingThinkingConfigfrom Messages API. Normalized tochat_template_kwargswith bothenable_thinking(Qwen3) andthinking(Kimi-K2.5) keys. Usesor_insertto not override user-provided kwargs.DeserializeonImageUrlto accept both bare string and object forms.warn!todebug!.Behavior change note:
ImageUrlnow uses a custom deserializer. Serialization is unchanged (#[derive(Serialize)]).Changes
crates/protocols/src/chat.rs: addthinkingfield + normalization + unit testscrates/protocols/src/common.rs: customImageUrldeserializer + unit tests (bare string, object, missing url, roundtrip, skip_none)crates/tokenizer/src/tiktoken.rs: log level warn -> debugTest Plan
ImageUrldeserialization + serializationthinkingnormalization (enabled, disabled, no-override, absent)Checklist:
cargo +nightly fmtpassescargo clippy --all-targets --all-features -- -D warningspassescargo test -p openai-protocolpasses (63 tests)Made with Cursor